Artificial Intelligence / Machine Learning – Explained

Learn what Artificial Intelligence and Machine Learning are, how they work, and why they matter. A clear, beginner-friendly explanation with real-world examples.


AI is a once-in-a-lifetime commercial and defense game changer

Hundreds of billions of dollars in public and private capital are now being invested in Artificial Intelligence and Machine Learning companies worldwide. The number of AI-related patents filed in 2021 was more than 30 times higher than in 2015, reflecting a global recognition by governments and businesses that these technologies will be profoundly disruptive. Beyond transforming industries, AI and Machine Learning have the potential to reshape geopolitical dynamics and even alter the balance of military power.

For years, expectations around AI outpaced its real-world capabilities. Today, that gap has narrowed dramatically. Recent breakthroughs across several critical areas have enabled AI systems to match—and in some cases surpass—human performance.

AI and the DoD

The U.S. Department of Defense recognizes Artificial Intelligence as a foundational set of technologies critical to future operations. As a result, it established a dedicated organization—the Joint Artificial Intelligence Center (JAIC)—to accelerate the adoption and implementation of AI across the Department. The JAIC provides the infrastructure, tools, and technical expertise needed for Department of Defense teams to successfully develop, deploy, and scale AI-enabled solutions.

This document later outlines several specific defense-related applications of Artificial Intelligence.

We’re in the Middle of a Revolution

Imagine it’s 1950, and you’ve traveled back in time from the present day. Your task is to explain the future impact of computers on business, defense, and society to an audience still relying on manual calculators and slide rules. You manage to convince a single company and a government to adopt computers early and to learn programming far ahead of their competitors and adversaries. They quickly discover how to digitally transform their operations—optimizing supply chains, enhancing customer interactions, and automating decision-making.

Now fast forward to today. Consider the overwhelming advantage that early adoption would have created in both commerce and national power. Such an organization or nation wouldn’t just compete—it would dominate.

That moment is where we stand today with Artificial Intelligence and Machine Learning. These technologies are poised to fundamentally transform businesses and government agencies alike. Already, hundreds of billions of dollars in private capital have been invested in thousands of AI startups, accelerating innovation at an unprecedented pace. Recognizing the strategic importance of AI, the U.S. Department of Defense has established a dedicated organization to ensure its rapid and effective deployment across the Department.

But What Is It?

Compared to the traditional computing paradigm that has dominated the past 75 years, Artificial Intelligence has given rise to an entirely new class of capabilities. These include novel applications such as facial recognition; new algorithmic approaches like machine learning; new computational models such as neural networks; specialized hardware like GPUs; and new roles for software professionals, including data scientists. Together, these elements fall under the broad umbrella of artificial intelligence.

At first glance, this convergence can feel like “buzzword bingo.” In reality, it represents a fundamental shift in what computers can do, how they perform those tasks, and the types of hardware and software required. This brief aims to clarify these changes and explain their significance.

New Words to Define Old Things

One reason the world of Artificial Intelligence and Machine Learning can feel confusing is that it has developed its own language and vocabulary. New terms are used to describe programming concepts, job roles, development tools, and workflows. However, once these ideas are mapped back to the familiar concepts of traditional computing, the landscape becomes much easier to understand.

Below is a brief list of key definitions to establish a common foundation.

AI/ML – A shorthand term for Artificial Intelligence and Machine Learning.

Artificial Intelligence (AI) – A broad, umbrella term used to describe “intelligent machines” capable of solving problems, making or recommending decisions, and performing tasks that have traditionally required human intelligence. AI is not a single technology, but rather a constellation of related approaches and systems.

Machine Learning (ML) – A subfield of Artificial Intelligence in which humans combine data with algorithms to train computational models. Once trained, these models can make predictions on new, unseen data—such as identifying whether an image contains a cat, a dog, or a person—or support decision-making processes like understanding text and images. Unlike traditional software, machine learning systems are not explicitly programmed for each task; instead, they learn patterns and behaviors from data.

Machine Learning Algorithms – Computer programs designed to improve their performance as they are exposed to increasing amounts of data. The “learning” aspect of machine learning refers to the ability of these algorithms to modify how they process information over time. In practice, a machine learning algorithm adjusts its internal parameters based on feedback from prior predictions, enabling it to become more accurate when analyzing data such as images, text, or other complex inputs.
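
As a rough illustration of what "adjusting internal parameters based on feedback" looks like in code, consider the minimal sketch below (not from the source; the data, learning rate, and variable names are invented). The program is never told that the answer is y = 3x; it discovers the weight from feedback on its own predictions.

```python
# Minimal sketch (invented example): a one-parameter model that "learns"
# by adjusting its weight from feedback on each prediction.
data = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]  # (input, correct answer) pairs

w = 0.0              # the internal parameter, initially a guess
learning_rate = 0.05

for epoch in range(200):
    for x, y_true in data:
        y_pred = w * x                   # make a prediction
        error = y_pred - y_true          # feedback: how wrong was it?
        w -= learning_rate * error * x   # adjust the parameter to reduce error

print(f"learned weight: {w:.3f}")        # converges toward 3.0
```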

Deep Learning / Neural Networks – A subfield of Machine Learning in which neural networks form the core technology. The term “deep” refers to the multiple layers within a neural network that progressively extract higher-level features from data. Neural networks are particularly effective at tasks such as image classification, speech recognition, and natural language processing.

In deep learning, neural network algorithms are trained on massive volumes of data and given a specific objective, such as classification or pattern recognition. The resulting models can perform highly complex tasks, including identifying objects within images and translating speech in real time. While often described conceptually, neural networks are ultimately implemented on physical hardware, typically specialized processors optimized for parallel computation.
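
To make "layers" concrete, the sketch below (illustrative only; the sizes and random values are invented) shows that each layer of a neural network is essentially a matrix multiplication followed by a simple nonlinearity. This is also why hardware optimized for parallel multiply-adds matters so much.

```python
# Illustrative sketch: a forward pass through a tiny two-layer network.
import numpy as np

rng = np.random.default_rng(0)

x = rng.random(4)          # a 4-feature input (e.g., pixel intensities)
W1 = rng.random((8, 4))    # layer 1: 8 neurons, each with 4 weights
W2 = rng.random((3, 8))    # layer 2: 3 output classes

h = np.maximum(0, W1 @ x)  # hidden layer: matrix multiply + ReLU
logits = W2 @ h            # output layer: another matrix multiply
probs = np.exp(logits) / np.exp(logits).sum()  # softmax -> probabilities

print(probs)               # e.g., confidence for [cat, dog, person]
```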

Data Science – A relatively new field within computer science that focuses on building data systems and processes to collect, manage, analyze, and derive meaning from large and complex datasets. In the context of Artificial Intelligence, data science encompasses the practices and methodologies used to support and develop machine learning solutions.

Data Scientists – Professionals responsible for extracting insights from data to inform business and organizational decision-making. They explore, analyze, and model data using machine learning tools and platforms to generate predictions and insights about customers, operations, risks, or other areas of interest.

What’s Different? Why Is Machine Learning Possible Now?

To understand why Artificial Intelligence and Machine Learning are capable of today’s breakthroughs, it helps to compare them with the computers that existed before AI emerged. (The examples below are intentionally simplified.)

Classic Computers

For more than 75 years, computers—referred to here as classic computers—have evolved dramatically in size and scale: they have shrunk to fit in our pockets as smartphones and expanded to fill massive cloud data centers spanning entire warehouses. Despite these changes in form and capacity, they have continued to operate in fundamentally the same way.

Classic Computers – Programming

Classic computers are designed to perform only the tasks that humans explicitly instruct them to carry out. Programmers write software code to create applications by defining, in advance, all of the rules, logic, and knowledge required for the program to produce a specific outcome. These instructions are precisely encoded using programming languages such as Python, JavaScript, C#, or Rust.
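
A small hypothetical example makes this concrete: every rule the program follows below was written out in advance by a person, and the program will never behave differently unless a programmer changes the code.

```python
# Sketch of the classic approach (hypothetical business rules):
# all logic is explicitly coded by a human in advance.
def shipping_cost(weight_kg: float, express: bool) -> float:
    """Compute shipping cost from hand-written rules."""
    if weight_kg <= 1.0:
        cost = 5.00
    elif weight_kg <= 5.0:
        cost = 9.50
    else:
        cost = 9.50 + 1.25 * (weight_kg - 5.0)
    if express:
        cost *= 2   # a rule a programmer wrote, not one learned from data
    return cost

print(shipping_cost(3.2, express=True))  # behaves exactly as programmed
```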

Classic Computers – Compiling

Once written, the source code is compiled—translated by software into a form that can be executed on a target device such as a computer, web browser, or smartphone. In most traditional software development workflows, the machine used to write and compile the code does not need to be significantly more powerful than the device that ultimately runs the program.

Classic Computers – Running / Executing Programs

Once a program has been written and compiled, it can be deployed and executed across a wide range of environments, including desktop computers, smartphones, web browsers, data center clusters, and specialized hardware systems. These programs may take many forms, such as games, social media platforms, office productivity tools, missile guidance systems, cryptocurrency mining software, or operating systems like Linux, Windows, and iOS.

Regardless of their purpose, these applications all run on the same fundamental class of traditional computer architectures for which they were originally programmed.

Classic Computers – Software Updates and New Features

In traditional software development, programmers continuously maintain applications after deployment. They respond to bug reports, monitor for security vulnerabilities, and release regular software updates to fix defects, improve performance, and occasionally introduce new features. All changes are explicitly designed, coded, and tested by human developers.

Classic Computers – Hardware

The central processing units (CPUs) used to develop and run classic computer applications share a common architectural foundation. These processors are designed to execute a broad variety of tasks efficiently in a largely serial manner. Examples range from Intel x86 processors and ARM-based cores such as those found in Apple’s M1 system-on-a-chip, to high-performance processors like IBM’s z15 mainframe CPU.

Machine Learning

In contrast to traditional programming based on fixed, hand-coded rules, machine learning allows computers to “learn by example.” Instead of explicitly defining every rule, humans provide large volumes of labeled data from which algorithms discover patterns on their own. For example, in image recognition tasks, a common rule of thumb is that a machine learning system may require thousands of labeled examples per category to achieve strong performance.

Once trained, a machine learning model operates independently, making predictions or complex decisions without further explicit programming.
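
A minimal end-to-end sketch, assuming the scikit-learn library is installed and using invented toy data, shows the contrast: no classification rules are written by hand, yet the trained model classifies examples it has never seen.

```python
# "Learning by example" in miniature (toy data, invented for illustration).
from sklearn.tree import DecisionTreeClassifier

# Labeled examples: [height_cm, weight_kg] -> "cat" or "dog"
X = [[25, 4], [30, 5], [23, 3], [60, 25], [55, 20], [65, 30]]
y = ["cat", "cat", "cat", "dog", "dog", "dog"]

model = DecisionTreeClassifier().fit(X, y)   # training: rules come from data

# Inference: the model predicts on inputs it has never seen before.
print(model.predict([[28, 4], [58, 22]]))    # -> ['cat' 'dog']
```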

Just as classic software development follows three primary stages—coding, compiling, and executing—machine learning follows a comparable lifecycle: training (teaching the model), pruning or optimization (refining it), and inference (using the trained model to make predictions).

Machine Learning – Training

Unlike traditional programming, which relies on explicitly defined rules, training is the process of “teaching” a computer to perform a task—such as recognizing faces, detecting signals, or understanding text. This is why users are often asked to identify traffic lights, crosswalks, stop signs, or buses, or to transcribe distorted text in CAPTCHA challenges; these interactions help generate labeled data used to train machine learning models.

During training, humans supply massive volumes of labeled training data and select appropriate algorithms that iteratively adjust themselves to achieve the most accurate and optimized outcomes. In general, more high-quality data leads to better model performance.

(For a detailed, step-by-step explanation, see the machine learning pipeline later in this section.)

By running an algorithm selected by a data scientist on a set of training data, a machine learning system generates the internal rules that become embedded within a trained model. Instead of being explicitly programmed, the system learns directly from examples provided in the data. (See the Types of Machine Learning section for additional detail.)

This self-correcting process is a key reason machine learning is so powerful. When a neural network receives an input, it produces a prediction about what that input represents. The network then compares its prediction to a known “ground truth,” effectively asking an expert, “Did I get this right?” The difference between the prediction and the correct answer is measured as error. That error is propagated backward through the model, and the network adjusts its internal weights in proportion to how much they contributed to the mistake.
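
The sketch below walks through one such self-correcting step for a single weight (deliberately simplified; real networks repeat this across millions of weights and many layers).

```python
# One backpropagation-style update for a single weight (simplified sketch).
x = 2.0           # input
w = 0.5           # current weight
target = 3.0      # ground truth ("Did I get this right?")

prediction = w * x              # forward pass: 1.0
error = prediction - target     # error signal: -2.0
gradient = error * x            # how much w contributed to the mistake
w = w - 0.1 * gradient          # adjust w in proportion (learning rate 0.1)

print(w)  # 0.9 -- the next prediction (1.8) is closer to the target (3.0)
```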

It’s worth emphasizing this point: the combination of algorithms and training data—not external human programmers—creates the rules the AI ultimately follows. The resulting model can perform highly complex tasks, such as recognizing objects it has never encountered before, translating text or speech, or coordinating the behavior of a drone swarm.

Rather than building a machine learning model from scratch, organizations can now obtain pre-trained models for many common tasks from third parties. This approach is similar to how chip designers license intellectual property (IP) cores instead of designing every component themselves, significantly reducing development time and cost.
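
For instance, image classifiers pre-trained on large public datasets can be downloaded and used directly. A sketch, assuming PyTorch and torchvision are installed ("photo.jpg" is a placeholder; input normalization is omitted for brevity):

```python
# Using a third-party pre-trained model (torchvision's ResNet-18).
import torch
from torchvision import models, transforms
from PIL import Image

model = models.resnet18(weights="IMAGENET1K_V1")  # download trained weights
model.eval()

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),       # normalization omitted for brevity
])

img = preprocess(Image.open("photo.jpg")).unsqueeze(0)  # placeholder file
with torch.no_grad():
    scores = model(img)          # scores for 1,000 ImageNet classes
print(scores.argmax().item())    # index of the most likely class
```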

Machine Learning – Training Hardware

Training a machine learning model is extremely computationally intensive. AI workloads require hardware capable of performing vast numbers of mathematical operations—particularly multiplications and additions involved in matrix multiplication. To achieve practical training times, specialized processors are used that are optimized for this type of parallel computation. (See the AI hardware section for additional details.)

Machine Learning – Model Simplification (Pruning, Quantization, and Distillation)

Just as traditional software must be compiled and optimized before deployment, machine learning models are refined after training to reduce their computational and resource requirements. Techniques such as pruning, quantization, and distillation simplify models so they consume less processing power, energy, and memory while maintaining acceptable performance.
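
As one illustration, quantization maps 32-bit floating-point weights to 8-bit integers, cutting memory roughly fourfold at a small cost in precision. A minimal sketch with invented values:

```python
# Quantization in miniature: float32 weights -> int8 (sketch).
import numpy as np

weights = np.random.default_rng(1).normal(size=1000).astype(np.float32)

scale = np.abs(weights).max() / 127.0            # map the range onto int8
quantized = np.round(weights / scale).astype(np.int8)

# At inference time the weights are reconstructed (approximately).
restored = quantized.astype(np.float32) * scale
print(f"max error: {np.abs(weights - restored).max():.4f}")
print(f"memory: {weights.nbytes} bytes -> {quantized.nbytes} bytes")
```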

Machine Learning – Inference Phase

Once a model has been trained and optimized, it can be deployed and replicated across multiple devices. During the inference phase, the hardware runs the trained model to make predictions or decisions on new data it has never encountered before.

Machine learning inference is often performed on devices located close to where data is generated—such as routers, sensors, and Internet of Things (IoT) devices. Running models at the “edge” reduces network bandwidth demands and minimizes latency, enabling faster, more reliable responses.

Machine Learning – Inference Hardware

Inference, or running a trained model, requires significantly less computational power than training. However, inference performance and efficiency are still greatly improved by specialized AI hardware designed to accelerate these workloads.

Machine Learning – Performance Monitoring and Retraining

Similar to traditional software systems that receive regular updates to fix bugs, improve performance, and add new features, machine learning models must also be maintained over time. This maintenance typically involves adding new data to existing training pipelines and retraining models periodically.

Without regular updates, machine learning models can become stale. Their real-world accuracy often degrades as conditions change—a phenomenon known as data drift or concept drift. To remain effective and safe, models must be continuously monitored for performance degradation, harmful or biased predictions, and changing data patterns. Retraining allows models to re-learn from recent data that more accurately reflects current reality.
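
A minimal monitoring sketch (the window size, threshold, and hook are invented for illustration): track accuracy over a sliding window of recent predictions and queue retraining when it sags.

```python
# Sliding-window accuracy monitor (illustrative thresholds and names).
from collections import deque

WINDOW = 500        # number of recent predictions to track
THRESHOLD = 0.90    # retrain if windowed accuracy falls below this

recent = deque(maxlen=WINDOW)

def record_outcome(prediction, ground_truth):
    """Call once the true label for a past prediction becomes known."""
    recent.append(prediction == ground_truth)
    if len(recent) == WINDOW and sum(recent) / WINDOW < THRESHOLD:
        trigger_retraining()

def trigger_retraining():
    # Hypothetical hook into the training pipeline.
    print("accuracy degraded -- queueing model for retraining on fresh data")
```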

One Last Thing – Verifiability and Explainability

Understanding how an AI system arrives at its decisions is essential for building trust and confidence in production deployments. Verifiability and explainability help stakeholders assess model behavior, identify errors or biases, and ensure AI systems operate in a reliable and accountable manner.

Neural networks and deep learning differ from many other machine learning approaches in that they offer relatively low explainability. While these models can generate highly accurate predictions, it is often difficult to understand or clearly explain how they arrived at a particular result. This challenge is commonly referred to as the “explainability problem.”

Although this issue is sometimes portrayed as a limitation of all artificial intelligence, it primarily affects neural networks and deep learning systems. Other machine learning techniques—such as decision trees and rule-based models—tend to be far more transparent, making their decision processes easier to interpret and audit.

For a deeper exploration of this challenge and potential solutions, the results of the DARPA Explainable AI Program provide valuable insights.

So What Can Machine Learning Do?

After decades of research and development, machine learning has reached a point where—even in its simplest implementations—it can perform certain tasks better and faster than humans. Today, machine learning is most mature and widely deployed in three core areas: processing text, understanding images and video, and detecting patterns or anomalies in data.

Recognize and Understand Text (Natural Language Processing)

Machine learning systems now outperform humans on several standardized reading comprehension benchmarks, and their performance on more complex language tasks is rapidly approaching human levels. As a result, Natural Language Processing (NLP) has become one of the most successful and commercially adopted applications of AI.

Common applications include automated translation, email and document autocomplete, conversational chatbots, and text summarization systems. These capabilities enable machines to read, interpret, generate, and respond to human language at scale, transforming how people interact with software and information.
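
As a concrete example, open-source libraries such as Hugging Face transformers expose several of these capabilities behind a one-line interface. A sketch, assuming the library is installed (the input text is invented; a default model is downloaded on first use):

```python
# Text summarization with a pre-trained NLP model (sketch).
from transformers import pipeline

summarizer = pipeline("summarization")   # downloads a default model

article = (
    "Machine learning systems learn patterns from labeled examples rather "
    "than from hand-written rules, and they now power translation, "
    "autocomplete, chatbots, and summarization at scale."
)
print(summarizer(article, max_length=30, min_length=10)[0]["summary_text"])
```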

Write Human-Like Answers and Assist in Writing Computer Code

Modern AI systems can generate original text that is often indistinguishable from human writing. They can answer questions, draft documents, summarize content, and assist with creative or technical writing. These systems are also capable of generating computer code, helping developers write functions, debug software, and learn new programming languages more efficiently. As a result, AI has become a powerful productivity tool for writers, engineers, and knowledge workers alike.


Recognize and Understand Images and Video Streams

AI systems can “see” and interpret visual information in images and video. They are able to identify objects, detect features, recognize faces, and analyze text embedded within video streams. These capabilities are used in a wide range of applications, including threat detection at airports, banks, and large public events; medical imaging analysis such as interpreting MRIs or supporting drug discovery; and retail analytics, where in-store imagery is analyzed to track inventory movement and customer behavior.


Detect Changes in Patterns and Recognize Anomalies

AI excels at identifying patterns across massive volumes of data and detecting deviations from expected behavior. These anomaly-detection systems can uncover signs of cyberattacks on financial networks, detect fraud in insurance claims or credit card transactions, identify fake reviews, and flag sensor data in industrial environments that may indicate safety or equipment issues. This ability to surface rare but critical events makes AI especially valuable in high-risk and high-scale systems.
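
A minimal sketch using scikit-learn's IsolationForest on invented transaction amounts illustrates the idea: the model learns what "normal" looks like, then flags deviations.

```python
# Anomaly detection in miniature (toy data invented for illustration).
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(42)
normal = rng.normal(loc=100.0, scale=5.0, size=(500, 1))  # typical amounts
detector = IsolationForest(random_state=0).fit(normal)

new_events = np.array([[101.0], [97.5], [240.0]])         # last one is odd
print(detector.predict(new_events))  # 1 = normal, -1 = anomaly; likely [1 1 -1]
```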


Power Recommendation Engines

AI-driven recommendation systems analyze user behavior to suggest relevant products, services, or content. In e-commerce and digital platforms, these systems use browsing history, purchase patterns, and preferences to deliver personalized recommendations that improve user experience and drive engagement. Voice assistants and smart platforms also rely on these models to anticipate user needs and provide timely, relevant suggestions.
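
A toy sketch of one underlying idea, collaborative filtering (invented data; production systems are far more sophisticated): recommend an item that the most similar other user has already chosen.

```python
# Collaborative filtering in miniature (toy purchase matrix).
import numpy as np

# Rows = users, columns = items; 1 means the user bought the item.
purchases = np.array([
    [1, 1, 0, 0],   # user 0
    [1, 1, 1, 0],   # user 1 (similar to user 0, also bought item 2)
    [0, 0, 1, 1],   # user 2
])

def recommend_for(user: int) -> int:
    u = purchases[user]
    norms = np.linalg.norm(purchases, axis=1) * np.linalg.norm(u)
    similarity = purchases @ u / norms   # cosine similarity to every user
    similarity[user] = -1.0              # ignore the user themselves
    neighbor = int(similarity.argmax())  # most similar other user
    candidates = (purchases[neighbor] == 1) & (u == 0)
    return int(np.argmax(candidates))    # an item the neighbor bought

print(recommend_for(0))  # -> 2: user 1 is most similar and bought item 2
```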

Recognize and Understand Your Voice

AI systems can understand spoken language, interpret meaning, and recognize context in real time. This capability enables conversational interfaces that allow people to interact naturally with machines. Voice-enabled AI can support chatbots in holding fluid conversations, record and transcribe meetings, and convert speech into searchable text. Some advanced systems can even analyze lip movements alongside audio to improve accuracy in noisy environments.

These technologies power widely used voice assistants from companies such as Apple, Amazon, and Google, and are increasingly integrated into customer service, productivity tools, and accessibility solutions.

Details of a Machine Learning Pipeline

The following section outlines a representative machine learning workflow, often referred to as a pipeline. This pipeline illustrates the steps data scientists follow to develop, deploy, and maintain a machine learning model over its lifecycle. Each stage plays a critical role in ensuring the model performs accurately, reliably, and safely in real-world environments.

What's Your Reaction?

like
0
dislike
0
love
0
funny
0
angry
0
sad
0
wow
0