A-C
- Agent -
In data, an application that employs automation for specific tasks. In HUMINT, a person who engages in clandestine activities under the direction of an intelligence organization, but is not an officer, employee, or co-opted worker of that organization.
- AI or Artificial Intelligence -
A term coined by John McCarthy, professor emeritus at Stanford, who defined it as the “science and engineering of making intelligent machines.” Today, it refers to the theory and development of computer systems able to perform tasks that normally require human intelligence.
- Algorithm -
A finite sequence of mathematical instructions used as specifications for performing calculations and data processing.
- Analysis -
The process in the production step of the intelligence cycle in which information is interpreted for meaning, facts are derived, and conclusions are drawn.
- Asymmetric Cryptography -
A modern branch of cryptography in which the algorithms employ a pair of keys (a public key and a private key) and use a different component of the pair for different steps of the algorithm.
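As a rough illustration of the key pair described under Asymmetric Cryptography, the sketch below builds a toy RSA key pair from two tiny primes. The primes, the exponent, and the helper name `apply_key` are assumptions chosen for readability; real systems use vetted libraries and far larger keys.

```python
# Toy RSA with tiny primes: for illustration only, never for real security.
p, q = 61, 53                 # two small primes (assumed values)
n = p * q                     # modulus shared by both keys
phi = (p - 1) * (q - 1)       # Euler's totient of n
e = 17                        # public exponent, coprime with phi
d = pow(e, -1, phi)           # private exponent: modular inverse of e (Python 3.8+)

public_key = (e, n)
private_key = (d, n)

def apply_key(value: int, key: tuple[int, int]) -> int:
    """Raise value to the key's exponent, modulo the key's modulus."""
    exponent, modulus = key
    return pow(value, exponent, modulus)

message = 65                                     # must be smaller than n
ciphertext = apply_key(message, public_key)      # encrypt with the public key
recovered = apply_key(ciphertext, private_key)   # decrypt with the private key
```

Each half of the pair undoes the other: what the public key encrypts, only the private key recovers.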
- Availability -
The assurance that authorized users can access and use information in a timely and reliable manner.
- Bias -
The result of systematic errors in ML models that favor certain groups and skew predictions, often due to flawed assumptions in the model training process.
- Biographical Intelligence -
The aggregate views, traits, habits, skills, relationships, and things of importance to an entity that are of interest to a potential or actual threat actor.
- Bit -
A contraction of "binary digit," it is the smallest unit of information storage.
- Byte -
A fundamental unit of computer storage equal to 8 bits; it is typically the smallest addressable unit in a computer's architecture.
- Cipher -
A cryptographic algorithm for encryption and decryption.
- Cipher Text -
The encrypted form of a message being sent.
- CI or Counterintelligence -
The information gathered and activities conducted to protect against espionage, other intelligence activities, sabotage, or assassinations conducted by or on behalf of foreign governments or elements thereof, foreign organizations, or foreign persons, or against international terrorist activities or organizations.
- Compromise -
The exposure of classified or sensitive information and/or activities to an unauthorized entity.
- Confidentiality -
In cybersecurity, the need to ensure that information is disclosed only to those who are authorized to view it.
- CSV or Comma Separated Values -
A plain text file format that stores tabular data with one record per line and the fields of each record separated by commas.
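For the CSV entry, a minimal sketch of reading comma-separated text with Python's standard `csv` module. The sample data and column names are hypothetical.

```python
import csv
import io

# Hypothetical tabular data: a header row followed by two records.
raw = "name,role\nalice,analyst\nbob,engineer\n"

# DictReader maps each record to the column names in the header row.
rows = list(csv.DictReader(io.StringIO(raw)))
```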
- Cybersecurity -
The art of protecting networks, devices, and data from unauthorized access or criminal use and the practice of ensuring confidentiality, integrity, and availability of information.
D-F
- Data -
A raw observation of an event that can be structured, semi-structured, or unstructured. Data needs to be processed before it can be interpreted by humans for meaning.
- Data Engineering -
The process of extracting, storing, transforming, and loading data so that it is available to analysts and data scientists.
- Data Exploration -
The initial analysis and understanding of the structure and content of a dataset to identify patterns, anomalies, and trends.
- Data Governance -
A series of policies, standards, and practices for managing and ensuring the quality, integrity, and security of data within an organization.
- Data Ingestion -
The process of collecting and importing data into a data system or storage layer from various sources.
- Data Lake -
A centralized repository for the storage of structured, semi-structured, and unstructured data at scale.
- Data Migration -
The process of transferring data from one system to another.
- Data Mining -
The process of extracting patterns, trends, and information from large datasets.
- Data Modeling -
The process of defining the structure of data and its relationship to other data in a database or system.
- Data Orchestration -
The coordination and management of dataflow across various systems, services, and processes to ensure that data workflows are executed in a controlled and organized manner.
- Data Pipeline -
A series of processes that move data smoothly from one system to another, typically involving multiple stages such as extraction, transformation, and loading.
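The stages named under Data Pipeline can be sketched as three small functions chained together. This is only an illustration under stated assumptions: the function names, the sample CSV text, the cleaning rules, and the in-memory `warehouse` list are all hypothetical stand-ins for real sources and stores.

```python
import csv
import io

def extract(source: str) -> list[dict]:
    """Extract: pull raw records from a source (here, CSV text)."""
    return list(csv.DictReader(io.StringIO(source)))

def transform(records: list[dict]) -> list[dict]:
    """Transform: trim and title-case names, drop rows with no score."""
    cleaned = []
    for record in records:
        if record["score"].strip() == "":
            continue  # discard incomplete rows
        cleaned.append({"name": record["name"].strip().title(),
                        "score": int(record["score"])})
    return cleaned

def load(records: list[dict], target: list) -> None:
    """Load: stage the cleaned records into a target store (here, a list)."""
    target.extend(records)

source = "name,score\n alice ,90\nbob,\ncarol,75\n"
warehouse: list[dict] = []
load(transform(extract(source)), warehouse)
```

Keeping each stage a separate function makes it easier to test, rerun, or replace one stage without touching the others.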
- Data Quality -
A measure of how well a dataset meets the standards for accuracy, completeness, consistency, reliability, uniqueness, and timeliness.
- Data Redundancy -
The occurrence of the same data stored in multiple places within a database. Redundancy can be accidental, but it can also be intentional.
- Data Science -
The academic field and discipline of extracting knowledge and insights from data.
- Data Warehouse -
An enterprise system used for the analysis and reporting of structured and semi-structured data from multiple sources.
- Data Workflow -
The series of steps, processes, and tasks involved in the end-to-end management and movement of data within an organization.
- Decryption -
The process of transforming an encrypted message into its original plaintext.
- Encryption -
The cryptographic transformation of “plaintext” data into “cipher text” that conceals the data's original meaning and prevents it from being known or used.
- Entity -
In cybersecurity, a distinct unit that can be identified and interacts with a computer system or network. This includes people, devices, software, or systems themselves.
- Espionage -
An intelligence activity directed towards the acquisition of information through clandestine means and proscribed by the laws of the country against which it is committed.
- Exploitation -
The process of obtaining information by taking advantage of the source.
- Extract -
The step of the ETL (Extract, Transform, Load) engineering process that pulls data from a source.
- Filter -
The process and/or technique of specifying which data and information to keep versus discard.
- Foreign Intelligence Service or FIS -
An organization of a foreign government that engages in intelligence activities (e.g., espionage).
- Fusion -
The blending of intelligence from multiple sources to produce a single intelligence product.
G-I
- Geographical Intelligence -
The aggregate locations, descriptions, and analysis of physical and cultural factors of the world as well as their changes over time.
- GPU or Graphics Processing Unit -
A specialized electronic circuit initially designed for digital image processing and to accelerate computer graphics by performing mathematical computations at high-speed.
- GPGPU or General Purpose GPU -
A GPU that is programmed to perform computation in applications traditionally handled by the central processing unit (CPU). Its cores operate at lower frequencies than a CPU's, but there are far more of them, which makes it efficient for highly parallel tasks. Today, GPGPUs serve as the backbone for ML/AI.
- Graph Database -
A systematic collection of data that emphasizes the relationships between the different data entities via graph structures (i.e. nodes, edges, and properties).
- Honeypot -
In cybersecurity, a mechanism set up to detect, deflect, or in some manner counteract attempts at the unauthorized use of information systems. In HUMINT, an operational practice involving the use of a covert agent (typically female) to create a sexual or romantic relationship in order to compromise a target.
- HUMINT or Human Intelligence -
A category of intelligence activities and information derived from humans.
- Human Source -
A person who has either wittingly or unwittingly shared information of potential intelligence value to an intelligence activity.
- Identity -
The set of attributes by which a person or thing is known or recognized.
- Incident -
An adverse event in an information system or network, or the threat of such an event occurring.
- Information -
The data that has been processed and can now be consumed and interpreted by analysts for meaning.
- Integrity -
In cybersecurity, the need to ensure that information has not been changed accidentally or deliberately, and that it is accurate and complete.
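One common way to check the Integrity property is to compare cryptographic digests. The sketch below uses SHA-256 from Python's standard `hashlib`; the messages and variable names are hypothetical. Note the assumption that the recorded digest itself is stored somewhere an attacker cannot alter; guarding against deliberate tampering in transit usually calls for a keyed construction such as an HMAC or a digital signature.

```python
import hashlib

def digest(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()

original = b"transfer 100 to account 42"
recorded = digest(original)          # stored when the data was known to be good

tampered = b"transfer 900 to account 42"

unchanged_ok = digest(original) == recorded   # same bytes, same digest
tampered_ok = digest(tampered) == recorded    # any change alters the digest
```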
- Intelligence -
The arrangement of direct and indirect information, facts, and conclusions in order for an organization and its leaders to take follow-on actions.
J-L
- JSON or JavaScript Object Notation -
An open standard file format and data interchange format that uses human-readable text to store and transmit data objects consisting of attribute–value pairs and arrays.
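To make the JSON entry concrete, a short sketch using Python's standard `json` module to turn a data object into text and back. The record contents are hypothetical.

```python
import json

# A hypothetical data object with attribute-value pairs and an array.
record = {"term": "JSON", "tags": ["format", "interchange"], "open_standard": True}

text = json.dumps(record)      # serialize to human-readable text
restored = json.loads(text)    # parse the text back into a Python object
```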
- Kernel -
The essential center of a computer operating system, or the core that provides basic services for all other parts of the operating system.
- Lakehouse -
A data management architecture that combines the best features of data lakes and data warehouses to create a single platform for storing and analyzing data.
- Least Privilege -
The principle of allowing users or applications the least amount of permissions necessary to perform their intended function.
- LLM or Large Language Model -
A computational model, trained on vast amounts of textual data, capable of interpreting, generating, and/or manipulating human language at scale.
- Load -
The step of the ETL (Extract, Transform, Load) engineering process that stages data into a data warehouse or other unified data repository.
M-O
- Malicious Code or Malware -
A software or program application that appears to perform a useful or desirable function, but actually gains unauthorized access to system resources or tricks a user into executing other malicious logic.
- Masquerade -
A type of attack in which one system entity illegitimately poses as (assumes the identity of) another entity.
- MICE or Money, Ideology, Coercion, Ego -
In HUMINT, basic human motivations are delineated into four categories: Money (i.e. motivated by financial incentives), Ideology (i.e. motivated by a set of beliefs or philosophies), Coercion (i.e. motivated by use of force or threats), and Ego (i.e. motivated by a sense of self-esteem or self-importance).
- ML or Machine Learning -
A branch of AI that trains machines to imitate the way humans learn: by identifying patterns in data and making decisions or predictions from them, rather than by following explicitly programmed rules.
- NLP or Natural Language Processing -
A branch of AI, typically built on ML, that gives computers the ability to interpret, manipulate, and generate human language, bridging the gap between computers and natural text.
- NN or Neural Network -
A machine learning model that uses a network of interconnected nodes, or artificial neurons, to process data in a way that mimics the human brain.
- Non-Repudiation -
The ability for a system to prove that a specific user and only that specific user sent a message and that it hasn't been modified.
- Normalization -
A database design technique that aims to minimize data redundancy and dependency by organizing data into separate tables.
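A small sketch of the idea behind Normalization, using plain Python structures in place of database tables. The table contents and column names are hypothetical.

```python
# Denormalized: the department name is repeated on every employee row.
flat = [
    {"employee": "alice", "dept_id": 1, "dept_name": "Analysis"},
    {"employee": "bob",   "dept_id": 1, "dept_name": "Analysis"},
    {"employee": "carol", "dept_id": 2, "dept_name": "Engineering"},
]

# Normalized: each department name is stored once, keyed by dept_id.
departments = {row["dept_id"]: row["dept_name"] for row in flat}
employees = [{"employee": row["employee"], "dept_id": row["dept_id"]} for row in flat]

# A join on dept_id reconstructs the original rows when needed.
rejoined = [{**emp, "dept_name": departments[emp["dept_id"]]} for emp in employees]
```

After the split, renaming a department means changing one entry in `departments` instead of every matching employee row.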
- OPSEC or Operations Security -
The process by which organizations protect data that if made public could create harm/damage.
- OSINT or Open-Source Intelligence -
The collection and analysis of data obtained from publicly accessible sources.
P-R
- Pandas -
A software library written for the Python programming language for data manipulation and analysis.
- Pharming -
A sophisticated form of man-in-the-middle (MITM) attack in which a user’s session is redirected to a masquerading website.
- Phishing -
The use of e-mails that appear to originate from a trusted source to trick a user into entering valid credentials at a fake website. Typically, the e-mail and the website look like they belong to a bank the user does business with.
- Processing -
The process of manipulating data (e.g. data transform) in order to make it consumable and useful for analysis. It can involve a variety of data operations such as: Collecting, Recording, Organizing, Filtering, Sorting, Analyzing, Storing, Retrieving.
- Python -
A high-level, general-purpose programming language.
- Q -
- Ransomware -
A type of malware that is a form of extortion. It works by encrypting a victim's hard drive, denying them access to key files until the victim pays a ransom to decrypt the files and regain access to them.
- RBAC or Role-Based Access Control -
An access control that assigns users to roles based on their organizational functions and determines authorization based on those roles.
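A minimal sketch of a role-based authorization check, as described under RBAC. The role names, permissions, users, and the function `is_authorized` are hypothetical.

```python
# Permissions are attached to roles, never directly to users.
ROLE_PERMISSIONS = {
    "analyst": {"read_report"},
    "engineer": {"read_report", "run_pipeline"},
}

# Each user is assigned a role based on organizational function.
USER_ROLES = {"alice": "analyst", "bob": "engineer"}

def is_authorized(user: str, permission: str) -> bool:
    """Authorize based on the user's role; unknown users get no permissions."""
    role = USER_ROLES.get(user)
    return permission in ROLE_PERMISSIONS.get(role, set())
```

Giving each role only the permissions its function requires also applies the Least Privilege principle.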
- Reconnaissance -
In cybersecurity, the phase of an attack where an attacker finds new systems, maps out networks, and probes for specific, exploitable vulnerabilities. In HUMINT, an operation undertaken to obtain information of interest by visual observation or other detection methods.
- Risk -
The likelihood of an event happening and the impact of the event if it happens.
- Risk Assessment -
The process by which risks are identified and the impact of those risks determined.
- Rootkit -
A collection of tools (programs) that a hacker uses to mask intrusion and obtain administrator-level access to a computer or computer network.
S-U
- Sabotage -
In cybersecurity, the deliberate actions to harm an organization’s physical or virtual infrastructure, including noncompliance with maintenance or IT procedures, contaminating clean spaces, physically damaging facilities, or deleting code to prevent regular operations. In HUMINT, an action against material, premises, and/or utilities, or their production, which injures, interferes with, or obstructs the national security or ability of a nation to prepare for or carry out war.
- Schema -
The structure of a data type, database, or data warehouse, including tables, columns, relationships, and constraints, which serves as a blueprint for organizing and representing data.
- Semi-Supervised Model -
An ML approach using both labeled and unlabeled data to improve model performance and generalizations as well as extracting relevant features from data.
- Signals Analysis -
A technical discipline within SIGINT that seeks to identify the purpose, content and user(s) of signals (i.e. communications, electronics, and foreign instrumentation).
- SIGINT or Signals Intelligence -
A category of intelligence activities and information derived, individually or in combination, from all communications, electronics, and foreign instrumentation signals. Arguably, the “OG” of big data.
- Smishing -
A combination of the terms "SMS" and "phishing" where fraudulent messages are sent over SMS (i.e. text messaging) rather than email.
- Social Engineering -
A euphemism for non-technical or low-technology means - such as lies, impersonation, tricks, bribes, blackmail, and threats - used to attack information systems.
- Source -
In data, the origin of a set of information, and can be the physical or digital location where data is stored. In HUMINT and SIGINT, a person, device, system, or activity from which intelligence information is obtained.
- Spark -
A multi-language engine for executing data engineering, data science, and machine learning on single-node machines or clusters.
- Spoofing -
A type of attack in which an unauthorized entity attempts to gain access to a system by posing as an authorized user.
- Supervised Model -
An ML approach using labeled inputs/outputs to train or “supervise” algorithms into making accurate predictions.
- Surveillance -
The systematic observation or monitoring of people, places, and things by visual, aural, electronic, photographic, and/or other means.
- Symmetric Cryptography -
A branch of cryptography involving algorithms that use the same key for two different steps of the algorithm (such as encryption and decryption, or signature creation and signature verification). It is sometimes called "secret-key cryptography" (versus public-key cryptography) because the entities that share the key must keep it secret from everyone else.
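To illustrate the single shared key described under Symmetric Cryptography, here is a toy XOR cipher in which the same key both encrypts and decrypts. It assumes a random key as long as the message and used only once; the helper name `xor_bytes` and the message are hypothetical, and real systems rely on vetted ciphers rather than a sketch like this.

```python
import secrets

def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each data byte with the matching key byte."""
    return bytes(d ^ k for d, k in zip(data, key))

plaintext = b"meet at noon"
key = secrets.token_bytes(len(plaintext))   # shared secret, same length as the message

ciphertext = xor_bytes(plaintext, key)      # encrypt with the key
recovered = xor_bytes(ciphertext, key)      # decrypt with the same key
```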
- Tamper -
An operational act to deliberately alter a system's logic, data, or control information to cause the system to perform unauthorized functions or services.
- Target -
A country, area, installation, organization, system, situation, signal, person, or other entity against which cyber and/or intelligence operations are conducted.
- Targeting -
The operational act of conducting cyber and/or intelligence operations against an organization and its people and systems for the purposes of gaining unauthorized access and collection.
- Target Analysis -
In HUMINT and SIGINT, an examination of potential targets to determine importance, priority of attack, and long-term forecasts on arising issues and potential concerns.
- Threat -
A potential for violation of security, which exists when there is a circumstance, capability, action, or event that could breach security and cause harm.
- Threat Assessment -
The process and analysis for the identification of the types of threats that an organization might be exposed to.
- Threat Intelligence -
In cybersecurity, a subfield that focuses on the structured collection, analysis, and dissemination of information regarding potential or existing cyber threats. In HUMINT and SIGINT, the collection, analysis, and dissemination of threats to the national security or ability of a nation to prepare for or carry out war.
- Training Data -
A large dataset used to train ML models to process information and accurately predict outcomes.
- Transform -
The step of the ETL (Extract, Transform, Load) engineering process that “cleans” data by applying a series of rules or functions to the extracted data in order to prepare it for loading.
- Trojan Horse -
A computer program that appears to have a useful function, but also has a hidden and potentially malicious function that evades security mechanisms, sometimes by exploiting legitimate authorizations of a system entity that invokes the program.
- Unsupervised Model -
An ML approach to analyze unlabeled data sets and discover hidden patterns in data without the need for human intervention. Also, useful for clustering data, anomaly detection, and/or cases where labeled data is absent.
- User -
A person, organizational entity, or automated process that accesses a system, whether authorized to do so or not.
- User Contingency Plan -
The alternative methods of continuing business operations if IT systems are unavailable.
V-X
- VPN or Virtual Private Network -
A restricted-use, logical (i.e., artificial or simulated) computer network that is constructed from the system resources of a relatively public, physical (i.e., real) network (such as the Internet), often by using encryption (located at hosts or gateways), and often by tunneling links of the virtual network across the real network.
- Virus -
A hidden, self-replicating section of computer software, usually malicious logic, that propagates by infecting - i.e., inserting a copy of itself into and becoming part of - another program. A virus cannot run by itself; it requires that its host program be run to make the virus active.
- Vishing -
A type of phishing attack that involves the use of voice calls, using either conventional phone systems or Voice over Internet Protocol (VoIP) systems.
- War Driving -
The process of traveling around looking for wireless access point signals that can be used to get network access.
- Wire Tapping -
The monitoring and recording of data that is flowing between two points in a communication system.
- Worm -
A computer program that can run independently, can propagate a complete working version of itself onto other hosts on a network, and may consume computer resources destructively.
- X -
Y-Z
- Y -
- Zero Day -
The “Day Zero,” or the day a new vulnerability is made known. A "zero day" exploit is an exploit for which no patch is available yet; "day one" is the day on which the patch is made available.
- Zombie -
A computer connected to the Internet that has been compromised by a hacker, a computer virus, or a trojan horse.