Artificial Intelligence at GPH

Terms and Definitions

Artificial Intelligence (AI):
A broad field of computer science (since the 1950s) focused on creating systems that simulate human-like intelligence—using algorithms and data to perform tasks such as visual perception, speech recognition, decision-making, and language translation.

Generative AI (GenAI):
A specialized branch of AI designed to create new content—text, images, audio, video, or code—by learning statistical patterns from training data. Key approaches include autoregressive models (e.g., GPT) and diffusion models (e.g., DALL·E, Stable Diffusion). Applications include AI writing assistants that help draft or improve text (such as ChatGPT, Jasper.ai, QuillBot, and Google Docs Smart Compose), and translation systems that generate fluent, context-aware translations.

Machine Learning (ML):
A subfield of AI in which algorithms “learn” patterns from labeled or unlabeled data to make predictions or decisions. ML workflows include data preparation, model training/validation, and deployment. Everyday examples: Siri/Alexa, Google Maps, Netflix recommendations, Nest thermostats.

Natural Language Processing (NLP):
An AI discipline at the intersection of linguistics and computer science that enables machines to interpret, analyze, and generate human language. Core tasks include sentiment analysis, named-entity recognition, parsing, summarization, question answering, language translation, and AI writing assistance. Common applications include Google Translate, DeepL Translator, Microsoft Translator, and writing tools like Grammarly.

Large Language Models (LLMs):
Advanced NLP models (typically transformer-based) trained on massive text corpora via ML. They predict the next word in a sequence to generate coherent, contextually relevant text and support tasks like few-shot learning, dialogue, and code generation (e.g., GPT-4o).

 

AI and Data Security

While AI technologies offer powerful capabilities and benefits, they also introduce significant data privacy and security concerns. Training AI models often requires large volumes of data, which can include sensitive, personal, or proprietary information. If this data is not properly secured, it may be vulnerable to unauthorized access, misuse, or data breaches, potentially exposing individuals or organizations to harm. 

GenAI can produce content that appears authoritative but is factually incorrect, fabricated, or based on outdated or biased sources—posing challenges in academic, journalistic, and professional settings. In academic contexts, relying on AI-generated outputs without proper verification can undermine scholarly integrity and lead to the spread of misinformation.

Falsified AI-generated content, such as deepfakes, synthetic text, or realistic audio, can be weaponized for malicious purposes like disinformation, fraud, identity theft, or social engineering attacks (e.g., sophisticated phishing). The use of AI in automated decision-making also raises questions about bias, fairness, and accountability.

 

How AI Gets Trained

AI models are trained by processing large amounts of data through algorithms that learn to recognize patterns and relationships within that data. During training, the model adjusts its internal parameters to minimize errors in its predictions or outputs. Common training methods include:

  • Supervised Learning: Using labeled data (inputs with correct answers) to teach the model, like training an image classifier with tagged photos.
  • Unsupervised Learning: Finding hidden patterns or groupings in unlabeled data, such as customer segmentation based on behavior.
  • Reinforcement Learning: Teaching models to make decisions through trial and error, guided by rewards or penalties, often used in robotics and game playing.
  • Transfer Learning: Starting with a pre-trained model and fine-tuning it on a smaller, specific dataset to improve performance on a particular task.
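The core training loop described above (adjust parameters to minimize prediction error) can be illustrated with a deliberately tiny supervised-learning sketch. This is illustrative only: the data, learning rate, and linear model are invented for the example, and real AI models use far larger datasets and millions to billions of parameters.

```python
# Minimal supervised-learning sketch: fit y = w*x + b to labeled data
# by gradient descent, nudging the parameters to shrink prediction error.
# Hypothetical toy data; the hidden rule is y = 2x + 1.

data = [(1, 3), (2, 5), (3, 7), (4, 9)]  # labeled examples: (input, correct answer)

w, b = 0.0, 0.0   # the model's internal parameters, initially untrained
lr = 0.02         # learning rate: how large each adjustment is

for _ in range(2000):            # each pass adjusts w and b slightly
    grad_w = grad_b = 0.0
    for x, y in data:
        err = (w * x + b) - y    # prediction minus the labeled answer
        grad_w += 2 * err * x    # gradient of squared error w.r.t. w
        grad_b += 2 * err        # gradient of squared error w.r.t. b
    w -= lr * grad_w / len(data)
    b -= lr * grad_b / len(data)

print(round(w, 2), round(b, 2))  # parameters approach 2 and 1
```

After training, the parameters have converged toward the pattern in the labeled data, which is the essence of supervised learning regardless of model size.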

For GenAI and LLMs, training involves feeding the model massive datasets of text (and, for multimodal models, audio and images), which are converted into numerical tokens so the model can learn the statistical relationships between words, phrases, and concepts. By predicting the most likely next word in a sequence during training, LLMs develop the ability to generate coherent, contextually relevant text and perform a wide range of language-related tasks.
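The next-word-prediction objective can be sketched with a toy model that simply counts which word follows which in a small corpus. The corpus below is invented for illustration; real LLMs learn the same objective with transformer networks over vastly larger data, rather than raw counts.

```python
# Toy next-word predictor: count word pairs (bigrams) in a tiny corpus,
# then predict the statistically most frequent follower of a given word.
from collections import Counter, defaultdict

corpus = (
    "the cat sat on the mat . "
    "the dog sat on the rug . "
    "the cat ate the fish ."
).split()

follows = defaultdict(Counter)
for cur, nxt in zip(corpus, corpus[1:]):
    follows[cur][nxt] += 1           # tally which word comes next

def predict_next(word):
    """Return the most likely next word seen in training."""
    return follows[word].most_common(1)[0][0]

print(predict_next("the"))   # 'cat' -- the most frequent word after 'the'
print(predict_next("sat"))   # 'on'
```

Chaining such predictions one word at a time is, in miniature, how an LLM generates text; the difference is that an LLM's "counts" are replaced by learned parameters that generalize to word sequences it has never seen.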

 

Chatbots

Chatbots are AI-powered conversational agents that use generative AI and natural language processing to simulate human-like interactions through text or voice interfaces. Examples such as Google Gemini, ChatGPT, Claude, and Llama enable users to engage in dynamic conversations and assist with tasks such as drafting or editing emails, writing essays, generating code, and answering questions. These systems interpret user inputs and generate contextually relevant responses in real time, making them useful for customer service, education, creative assistance, and productivity.

 

Google Gemini at NYU

Google Gemini is a GenAI chatbot that can be used to create new content, streamline repetitive tasks, assist with communications, and more. Currently, Gemini is part of NYU Google Workspace Services but is used as a standalone tool; it is not integrated directly into NYU Google apps such as Gmail and Google Docs. Gemini is available at gemini.google.com or via the Google App Launcher menu at the top right of most Google apps.

 

Google NotebookLM at NYU

NotebookLM is a personalized research assistant. You can upload various file types to the tool, ask questions about those files, receive answers with cited sources, and produce outputs such as summaries, briefing docs, timelines, FAQs, study guides, and audio overviews. Currently, NotebookLM is part of NYU Google Workspace Services but is used as a standalone tool; it is not integrated directly into NYU Google apps such as Gmail and Google Docs. NotebookLM is available at notebooklm.google.com.

 

Data Privacy in Google Gemini and NotebookLM

When you’re logged into Gemini or NotebookLM with your NYU NetID account, Google will not train its AI models on your data. While these tools save your past queries and results, they are visible only to you, and you can delete them at any time, ensuring your data remains private.

Gemini and NotebookLM are considered “Additional Services” under NYU’s Google Workspace for Education agreement, and their use is governed by the Google Terms of Service and the Google Privacy Policy. Uploads, queries, and the tools’ responses will not be reviewed by humans or used to train AI models.

You should use Gemini and NotebookLM as you would NYU Google Drive/Email with regard to Data Privacy & Security Classification. Refer to the Electronic Data and System Risk Classification Policy to determine your data risk classification.