CS 314: Vocabulary

A*: a heuristic search algorithm that attempts to find a desired goal using a heuristic function to estimate the distance from a given node to the goal.

abstract data type: a description of operations on a data type that could have multiple possible implementations.

acyclic: describes a graph with no cycles (circular paths).

adjacency list: a representation of a graph in which each node has a list of nodes that are adjacent to it, i.e. connected to it by an arc.

adjacency matrix: a representation of a graph in which a boolean matrix contains a 1 at position (i,j) iff there is an arc from node i to node j.

ancestors: in a tree, the union of a node's parent and the parent's ancestors.

arc: a link between two nodes in a graph.

array: A contiguous block of memory containing elements of the same type, accessed by numeric index.

ASCII: (pronounced "ask-key") an abbreviation of American Standard Code for Information Interchange, a character code that maps between 8-bit binary integers and characters such as letters, numbers, and punctuation. There are 256 possible ASCII codes, of which 95 are printable.

association list: a list of pairs, where each pair has a key and a value associated with the key.

AVL tree: a self-balancing sorted binary tree, in which the heights of subtrees differ by at most 1.

B-tree: a tree with a high branching factor, to minimize the number of disk accesses required to access a desired record.

backtrack: in a tree search, to move back from the node currently being examined to its parent.

balanced tree: a tree in which the heights of subtrees are approximately equal.

bandwidth: information transfer rate of a network connection, in bits/second.

base case: a simple case that can be solved easily, without recursion.

Big O: an abstracted function that describes the amount of computer time or memory space required by an algorithm, as a function of problem size. For problems larger than a certain size, the actual time or space required will be less than the Big O multiplied by some constant.

bijective: describes a relation that is both injective and surjective (one-to-one and onto).

binary heap: a data structure that implements a complete binary tree within an array, such that every parent node has a value that is less than the value of either of its children.

binary tree: a tree in which each node has at most two children.

binary search: search of a binary tree or other structure, in which the size of the set to be searched is cut in half at each step.

binding: an association of a name with a value.

binding list: a list structure that represents a set of bindings.

bit: short for binary digit, the smallest unit of computer memory. A bit can have the value 0 or 1.

Boolean matrix: a matrix whose elements are 0 or 1.

boxed number: a number that is defined as an object, so that it has a runtime type and methods that can be used, e.g. Integer in Java.

branching factor: in a search tree, the number of children of a given node. Often, the branching factors of individual nodes will vary, so an average value may be used.

bucket: a collection, such as a linked list, of values that hash to the same value.

byte: an 8-bit piece of data, which can represent a character in a code such as ASCII.

bytecodes: the term used for the language of compiled Java. A given machine may have an interpreter for bytecodes (the JVM), or it may translate the bytecodes to native machine code so that it runs faster.

cache: to save a value locally to save re-computing or transferring it in the future.

Cartesian product: a set of pairs (x, y) of elements from two sets X and Y.

child: in a tree, a node pointed to by a parent node.

circularly linked list: a linked list in which the last element points back to the first element.

circular queue: a queue implemented within an array, where the first element of the array logically follows the last element.

class: in object-oriented programming, a description of a set of similar objects.

clustering: a situation in which many elements hash to the same hash value.

collision: when two values to be stored in a hash table have the same hash value.

comparison: the act of comparing two values to determine which is greater according to some ordering.

cons: 1. in Lisp, the function that constructs a pair of pointers, or basic element of list structure. 2. a cons data structure. 3. to make a cons data structure.

constructive: describes a function that makes a new data structure but does not modify its arguments.

CPU: Central Processing Unit, the "brain" of a computer, which performs operations on data.

critical path: in a PERT chart or scheduling graph, a path from the initial state to the goal such that any increase in time required along the critical path will increase the time to complete the whole project.

cycle: a circular path in a graph.

DAG: directed acyclic graph.

dense graph: a graph such that a large fraction of possible connections among nodes are present, i.e. the number of edges is of the order of the number of vertices squared. cf. sparse graph.

depth: the number of links between the root of a tree and the leaves.

depth-first search: a search in which children of a node are considered (recursively) before siblings are considered.

dereference: to convert from a pointer (address) to the data that is pointed to.

descendants: all nodes below a given node in a tree.

design pattern: a pattern that describes a set of similar programs.

destructive: describes a function that modifies its arguments.

DFS: depth-first search.

Dijkstra's algorithm: an optimal greedy algorithm to find the minimum distance and shortest path to all nodes in a weighted graph from a given start node.

directed: describes an arc that can only be traversed in one direction, or a graph with such arcs.

directed acyclic graph: a directed graph with no cycles. Every tree is a DAG, but a DAG may be more general.

discrete event simulation: a simulation in terms of events, in which the highest-priority (least time) event is removed from an event queue and executed, which may have the effect of scheduling future events.

divide and conquer: a problem-solving strategy in which a problem is broken down into sub-problems, until simple subproblems are reached.

domain: the set of values that are the source values of a mapping.

doubly linked list: a linked list in which each element has both forward and backward pointers.

edge: a link or arc between nodes in a graph.

exclusive or: a binary Boolean function whose output is 1 if its inputs are different. Abbreviated XOR.

external sort: a sort using external storage such as disk in addition to main memory.

fair: describes a process in which every arriving customer will eventually be served.

FIFO: first-in, first-out: describes the ordering of a queue. A queue is fair.

filter: a process that removes unwanted elements from a collection.

first-child/next-sibling: a way of implementing trees that uses two pointers per node but can represent an arbitrary number of children of a node.

fold: to process a set of items using a specified function; another term for reduce.

garbage: 1. data that is incorrect, meaningless, or random; 2. storage that is no longer pointed to by any variable and therefore can no longer be accessed.

garbage collection: the process of collecting garbage for recycling.

gedanken: describes a thought experiment or view of an entity.

geometric series: a series in which each successive term is multiplied by a constant less than 1, e.g. 1 + 1/2 + 1/4 + 1/8 + ...

goal: an item (or description of items) being sought in a search.

grammar: a formal description of a language in terms of vocabulary and rules for writing phrases and sentences.

gradient ascent: a method of finding the value x where f(x) is maximum by taking steps proportional to the gradient or slope of the function. Also called steepest ascent, or gradient descent or steepest descent if the minimum of the function is sought.

graph: a set of nodes and arcs connecting the nodes.

greedy algorithm: an algorithm that always tries the solution path that appears to be the best. "Eat dessert first."

hash function: a function that is deterministic but randomizing, i.e. whose output is a relatively small integer that appears to be a random function of the key value.

hashing with buckets: a hash table in which an item's hash value gives the index of a pointer to a bucket, an auxiliary structure containing the items with the same hash value. Using a linked list for a bucket is called separate chaining.

heuristic: a function that estimates the distance from a given node to the goal in A* search. More generally, a method that generally gives good advice about which direction to go or how to approach a problem.

heuristic search: A* search.

immutable: describes a data structure that cannot be changed once it has been created, such as Integer or String in Java.

in-place: describes a sort that does not require any additional memory.

injective: describes a mapping in which each element of the domain maps to a single element of the range. Also, one-to-one.

inorder: an order of processing a tree in which the parent node is processed in between its children.

interior node: a node of a tree that has children.

internal sort: a sort using only the main memory of the computer.

interpreter: a program that reads instructions, determines what they say, and executes them. The CPU is an interpreter for machine language; the JVM is an interpreter for compiled Java bytecodes.

intersection: given two sets, the intersection is the set of elements that are members of both sets.

intractable: a problem that is so hard (typically exponential) that it cannot be solved unless the problem is small.

iterator: an object containing data and methods to iterate through a collection of data, allowing processing of one data item at a time.

JVM: Java Virtual Machine, an interpreter for compiled Java bytecodes.

latency: the delay between asking for data from an I/O device and the beginning of data transfer.

leaf: a tree node containing a contents value but with no children.

LIFO: last-in, first out: describes the order of a stack.

linear: O(n), a problem whose solution requires a linear amount of time or space if the problem is of size n.

link: a pointer to the next element in a linked list.

linked list: a sequence of records, where each record contains a link to the next one.

load factor: in a hash table, the fraction of the table's capacity that is filled.

map: in MapReduce, a program that processes an element of the input and may emit one or more (key, value) pairs. In Java, a Map is a data structure that implements a mapping, such as HashMap or TreeMap

mapcan: in Lisp, a program that applies a mapping function to each element of a list of inputs, producing a list that concatenates corresponding results. The mapping function produces a list of results for each input, which allows it to produce multiple results or an empty result.

mapcar: in Lisp, a program that applies a mapping function to each element of a list of inputs, producing a list of corresponding results.

mapping: association of elements of a Range set with elements of a Domain set. We write M : R → D , for example PhoneDirectory : Name → Number .

master: a program that controls a set of other programs or devices.

max queue: a priority queue in which the maximum element is removed first.

memory hierarchy: the use of several kinds of memory hardware in a computer system, where the fastest memory (e.g. cache) is smallest, slower memory (e.g. RAM) is larger, and the slowest memory (e.g. disk) is largest.

memory locality: the processing of data in such a way that data that are located near each other by memory address are accessed nearby in time.

merge: to combine two ordered linear structures into one.

min queue: a priority queue in which the minimum element is removed first.

minimum spanning tree: a tree formed from the nodes of a graph and a subset of its edges, such that all nodes are connected and the total cost of the edges is minimal.

node: an element of a linked list, tree, or graph, often represented by a data structure.

null dereference: a runtime error that occurs when an operation such as a method call is attempted on a null pointer.

object: a data structure that can be identified at runtime as being a member of a class.

on-line: describes a sorting algorithm that can process items one at a time.

one-to-one: describes a mapping in which each element of the domain maps to a single element of the range. Also, injective.

onto: describes a mapping in which each element of the range is the target of some element of the domain. Also, surjective.

ontology: a description of the kinds of objects that exist in a computer program, e.g. a Java class hierarchy.

operator: in a search tree, a program that changes a state into a child state, e.g. a move in a game.

parent: in a tree, a node that points to a given node.

parsing: analysis of a sentence of a language to determine the elements of the sentence and their relationship and meaning.

path: a sequence of steps along arcs in a graph.

pattern: a representation of a class of objects, containing some constant elements in relation to variable elements.

pattern variable: a part of a pattern that can match variable parts of an input.

pivot: in Quicksort, a "center" value used in partitioning the set to be sorted.

pointer: a variable containing the address of other data.

postorder: an order of processing a tree in which the parent node is processed after its children.

predicate: a function that returns True or False. In Lisp, {\tt nil} represents False, and anything else represents True.

preorder: an order of processing a tree in which the parent node is processed before its children.

priority queue: a queue in which the highest-priority elements are removed first; within a priority value, the earliest arrival is removed first.

quadratic: O(n2), a problem whose solution requires a quadratic amount of time or space if the problem is of size n.

queue: a data structure representing a sequence of items, which are removed in the same order as they were inserted.

random access: describes a data structure or device in which all accesses have the same cost, O(1).

randomized algorithm: an algorithm in which the data to be processed or the deice to process it is randomly selected.

range: a set of values that are the targets of a mapping.

recursion: a case where a program calls itself.

recursive case: a condition of the input data where the data will be handled by call(s) to the same program.

Red-Black tree: a self-balancing binary tree in which nodes are "colored" red or black. The longest path from the root to a leaf is no more than twice the length of the shortest path.

reduce: to apply a given function to the elements of a given list. Also, fold.

reference: a pointer to data.

reference type: a type in which variables of that type are pointers to objects. In the code Integer i = 3, the variable i holds a pointer to the Integer object that contains the value. In int j = 3, the variable j contains the value. In Java, only reference types have methods.

rehash: to apply a different hashing function to a key when a collision occurs.

root: the top node of a tree, from which all other nodes can be reached.

row-major order: a way of storing a multiply-dimensioned array in memory, such that elements of a row are in adjacent memory addresses.

runtime stack: a stack containing a stack frame of variable values for each active invocation of a procedure.

scalability: the ability of an algorithm or hardware system to grow to handle a larger number of inputs.

scope: the area of program text over which a variable can be referenced.

search: to look through a data structure until a goal object is found.

sentinel: an extra record at the start or end of a data structure such as a linked list, to simplify the processing.

separate chaining: hashing with buckets, using a linked list to store the contents of a bucket.

set difference: given two sets, the set difference is the set of elements of the first set that are not members of the second set.

shadow: to hide similar items with the same name.

shortest path: the shortest path between a start node and a goal node in a weighted graph.

side-effect: any effect of a procedure other than returning a value, e.g. printing or modifying a data structure.

simple path: a path between two nodes in a graph that does not revisit any intermediate node.

slack: in a PERT chart or scheduling graph, the amount of time by which the time of an activity could be increased without affecting the overall completion time.

slave: a program or device that operates under control of a master.

sort: to modify the order of a set of elements so that a desired ordering holds between them, e.g. alphabetic order.

sparse array: an array in which most of the elements are zero or missing.

sparse graph: a graph in which any node is connected to relatively few other nodes. cf. dense graph.

spatial locality: being close together in space, i.e. memory address.

Splay tree: a self-balancing binary tree that places recently accessed elements near the top of the tree for fast access.

stable: describes a sort algorithm in which the relative position of elements with equal keys is unchanged after sorting.

stack frame: a section of the runtime stack holding the values of all variables for one invocation of a procedure.

stack space: the amount of space on the runtime stack required for execution of a program.

state: a description of the state of a process, such as a board game.

structure sharing: a case where two data structures share some elements.

successor: the next element in a linked list.

surjective: describes a mapping in which each element of the range is the target of some element of the domain. Also, onto.

symbol table: a data structure that links names to information about the objects denoted by the names.

tail recursive: a function whose value either does not involve a recursive call, or is exactly the value of a recursive call.

taxonomy: a classification of objects into a tree structure that groups related objects.

temporal locality: being close together in time, i.e. memory accesses that occur within a short time of each other.

topological sort: a linear ordering of nodes of an acyclic graph, such that a node follows all of its graph predecessors in the ordering.

tree rotation: changing the links in a binary tree to change the relative heights of the child subtrees, while leaving the sort order of the tree unchanged.

undirected: describes a graph in which the arcs may be followed in either direction.

Unicode: a character code that maps between binary numbers and the characters used in most modern languages, more than 110,000 characters. The lowest values of the UTF-8 encoding of Unicode are the same as ASCII, allowing characters to be 8 bits when staying within the ASCII character set. For other languages, more bits are used. Java uses Unicode.

union: given two sets, the union is the set of elements that are members of either set.

unparsing: converting an abstract syntax tree into a sentence in a language, such as a programming language.

vertex: a node in a graph.

virtual machine: an abstract computer that is simulated by an interpreter program running on an actual computer.

weight: a number that denotes the cost of following an arc in a graph.

well-founded ordering: an ordering that can be guaranteed to terminate, e.g. starting at a positive integer and counting down to 0.

word: a group of bits that are treated as a unit and processed in parallel by a computer CPU. Common word sizes are 32 bits and 64 bits.

XML: eXtensible Markup Language, a way of writing data in a tree-structured form by enclosing items in pairs of opening and closing tags, e.g. <zipcode> 78712 </zipcode>

XOR: exclusive or.

CS 314