APS Key Review Course: Introduction to Artificial Intelligence
Overview
Brief Introduction
Logic and Reasoning
Searching
Supervised Learning
Unsupervised Learning
Statistics and Machine Learning
Deep Learning
Reinforcement Learning
Game Theory
Brief Introduction
Symbolism (logic and reasoning), connectionism (deep learning), and behaviorism (reinforcement learning)
Logic and Reasoning
Propositional calculus (propositional logic)
Basic
Every proposition is True or False.
Connectives
and, or, not, conditional, bi-conditional
\land,\lor,\neg,\rightarrow,\leftrightarrow
Compound proposition
Proposition Logic
(a\Rightarrow b)\equiv(\neg b\Rightarrow\neg a), Contraposition
(a\Rightarrow b)\equiv(\neg a\lor b), Implication Elimination
\neg(a\land b)\equiv(\neg a\lor\neg b), Law of De Morgan 1
\neg(a\lor b)\equiv(\neg a\land\neg b), Law of De Morgan 2
(a\Rightarrow b,\ a)\Rightarrow b, Modus Ponens
(a_1\land a_2\land\cdots\land a_n)\Rightarrow a_i, And-Elimination
(a_1,a_2,\cdots,a_n)\Rightarrow(a_1\land a_2\land\cdots\land a_n), And-Introduction
\neg\neg a\Rightarrow a, Double-Negation Elimination
(a\lor b,\ \neg b)\Rightarrow a, Unit Resolution
Normal Form
CNF, Conjunctive Normal Form: (\lor)\land(\lor)\land(\lor)
DNF, Disjunctive Normal Form: (\land)\lor(\land)\lor(\land)
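A quick worked example applying the laws above (implication elimination, De Morgan, then distributing \lor over \land) to put a formula into CNF:

```latex
(a \lor b) \Rightarrow c
  \equiv \neg(a \lor b) \lor c                   % implication elimination
  \equiv (\neg a \land \neg b) \lor c            % De Morgan 2
  \equiv (\neg a \lor c) \land (\neg b \lor c)   % distribute \lor over \land: CNF
```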
Predicate logic
aka first-order logic
individual, predicate, quantifier
universal quantifier \forall
existential quantifier \exists
Laws
Universal Instantiation, UI: \forall x\,P(x)\Rightarrow P(c) for any constant c
Universal Generalization, UG: P(c) for an arbitrary c \Rightarrow \forall x\,P(x)
Existential Instantiation, EI: \exists x\,P(x)\Rightarrow P(c) for some fresh constant c
Existential Generalization, EG: P(c)\Rightarrow\exists x\,P(x)
Knowledge Graph
A graph with multiple relation types.
Each edge is a triplet (head entity, relation, tail entity).
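As a minimal sketch, such a graph can be stored as a set of triplets in Python; the entities and relations below are made-up examples:

```python
# A knowledge graph as a set of (head entity, relation, tail entity) triplets.
kg = {
    ("Beijing", "capital_of", "China"),
    ("China", "located_in", "Asia"),
}

# Query: all (relation, tail) pairs for a given head entity.
print({(r, t) for (h, r, t) in kg if h == "Beijing"})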
Causal Inference
Simpson’s Paradox: a relationship between two variables appears to be present in different groups of data, but disappears or reverses when the groups are combined.
Searching
Basic
State, action, state transition, path/cost, goal test
Search Tree
Heuristic Search
Evaluation function f(n), heuristic function h(n).
Greedy best-first search: f(n)=h(n).
A* search: f(n)=g(n)+h(n), where g(n) is the cost from the source to node n and h(n) estimates the remaining cost to the goal.
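A minimal A* sketch in Python; the adjacency dict `graph` and the heuristic `h` are assumed inputs, not part of the original notes:

```python
import heapq

def a_star(graph, source, goal, h):
    # graph: dict node -> list of (neighbor, edge_cost); h(n): heuristic cost-to-goal.
    # The frontier is a priority queue ordered by f(n) = g(n) + h(n).
    frontier = [(h(source), 0, source, [source])]
    best_g = {source: 0}
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return path, g
        for nbr, cost in graph.get(node, []):
            g2 = g + cost
            if g2 < best_g.get(nbr, float("inf")):
                best_g[nbr] = g2
                heapq.heappush(frontier, (g2 + h(nbr), g2, nbr, path + [nbr]))
    return None, float("inf")
```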
Adversarial Search
aka game search, for zero-sum games.
Each agent maximizes its own payoff and thereby minimizes its opponent's.
Minimax Search
Traverse every position reachable from the current state, back up the minimax value of each position, and choose the action with the best guaranteed outcome.
If the search tree is large, minimax search cannot finish in time.
Alpha-Beta Pruning
decrease the number of nodes.
Alpha: the best (highest) value found so far for the MAX node; initialized to -\infty.
Beta: the best (lowest) value found so far for the MIN node; initialized to +\infty.
Pruning
- alpha cut-off: if n is a MIN node and one of its successors yields a value \le alpha, discard n and its remaining successors.
- beta cut-off: if n is a MAX node and one of its successors yields a value \ge beta, discard n and its remaining successors.
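A sketch of minimax with alpha-beta pruning, assuming hypothetical game helpers `successors(state)` and `evaluate(state)`:

```python
import math

def alphabeta(state, depth, alpha, beta, maximizing, successors, evaluate):
    # successors(state) -> child states; evaluate(state) -> payoff for MAX (assumed helpers).
    children = successors(state)
    if depth == 0 or not children:
        return evaluate(state)
    if maximizing:
        best = -math.inf
        for child in children:
            best = max(best, alphabeta(child, depth - 1, alpha, beta, False,
                                       successors, evaluate))
            alpha = max(alpha, best)
            if best >= beta:   # beta cut-off: MIN would never allow this branch
                break
        return best
    else:
        best = math.inf
        for child in children:
            best = min(best, alphabeta(child, depth - 1, alpha, beta, True,
                                       successors, evaluate))
            beta = min(beta, best)
            if best <= alpha:  # alpha cut-off: MAX would never allow this branch
                break
        return best
```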
Monte-Carlo Tree Search, MCTS
Monte-Carlo planning in a single state
Multi-armed bandit (slot machine with k arms)
k arms; pulling arm i returns a stochastic reward.
Objective: minimize the regret function (and its variance).
Upper Confidence Bound Strategy, UCB
\mathrm{UCB}_i(t)=\bar{x}_{i,t-1}+C\sqrt{\frac{2\ln t}{n_{i,t-1}}}
\bar{x}_{i,t-1} is the average reward of the i-th arm over the past t-1 plays.
n_{i,t-1} is the number of times the i-th arm was chosen over the past t-1 plays.
C is a balance factor that trades off exploitation against exploration.
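A few-line sketch of UCB arm selection; the array names are illustrative:

```python
import math

def ucb_choice(avg_reward, pulls, t, C=1.0):
    # avg_reward[i]: mean reward of arm i so far; pulls[i]: times arm i was pulled.
    def ucb(i):
        if pulls[i] == 0:
            return math.inf          # try every arm at least once
        return avg_reward[i] + C * math.sqrt(2 * math.log(t) / pulls[i])
    return max(range(len(pulls)), key=ucb)
```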
4 steps of MCTS
- Selection: starting from the root, repeatedly move to the child with the largest UCB value until a leaf L is reached.
- Expansion: if L is not terminal, add one of its unvisited successors C, chosen at random.
- Simulation: from C, play the game out to a terminal state with random moves.
- Back-propagation: update the win count and visit count of every node along the path, in order.
MCTS is based on sampling; it does not traverse every state.
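A compact MCTS skeleton showing the four steps; `legal_moves`, `apply_move`, `is_terminal`, and `winner_score` are assumed game-interface helpers, and flipping the result between the two players is omitted for brevity:

```python
import math
import random

class Node:
    def __init__(self, state, parent=None):
        self.state, self.parent = state, parent
        self.children = []
        self.untried = list(legal_moves(state))  # moves not yet expanded
        self.wins, self.visits = 0.0, 0

def ucb(node, C=1.4):
    return (node.wins / node.visits
            + C * math.sqrt(2 * math.log(node.parent.visits) / node.visits))

def mcts(root_state, n_iter=1000):
    root = Node(root_state)
    for _ in range(n_iter):
        node = root
        # 1. Selection: descend to the child with the largest UCB value.
        while not node.untried and node.children:
            node = max(node.children, key=ucb)
        # 2. Expansion: add one unvisited successor at random.
        if node.untried:
            move = node.untried.pop(random.randrange(len(node.untried)))
            child = Node(apply_move(node.state, move), parent=node)
            node.children.append(child)
            node = child
        # 3. Simulation: random playout from the new node to a terminal state.
        state = node.state
        while not is_terminal(state):
            state = apply_move(state, random.choice(legal_moves(state)))
        result = winner_score(state)  # e.g. 1.0 for a win, 0.0 for a loss
        # 4. Back-propagation: update win/visit counts along the path.
        while node is not None:
            node.visits += 1
            node.wins += result
            node = node.parent
    return max(root.children, key=lambda c: c.visits).state
```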
Supervised Learning
Machine Learning Basics
Learning from data.
Learn a mapping function f: X \rightarrow Y from inputs to outputs.
supervised, unsupervised, reinforcement.
data-annotation, model training, loss function.
Regression Analysis
Linear regression
y=w^\top x+b, fit by least squares: \min_{w,b}\sum_i\left(y_i-w^\top x_i-b\right)^2
Logistic regression
Sigmoid function \sigma(z)=\frac{1}{1+e^{-z}}; predict p(y=1\mid x)=\sigma(w^\top x+b).
Decision Tree
Each root-to-leaf path encodes a decision rule; leaf nodes give the predicted label.
Information entropy: H(X)=-\sum_i p_i\log p_i; the lower the entropy, the more confident the prediction.
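A small sketch of the entropy formula:

```python
import math

def entropy(probs):
    # H = -sum p_i * log2(p_i), with 0*log(0) treated as 0.
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))  # 1.0 bit: maximally uncertain
print(entropy([0.9, 0.1]))  # ~0.47 bits: more confident
```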
Linear Discriminant Analysis
Classification by linear projection.
Multi-class case: doubles as dimensionality reduction.
AdaBoost (Adaptive Boosting)
Divide and conquer: decompose a difficult classification task into several easy sub-tasks.
Combine several weak classifiers into one strong classifier.
For each weak classifier, reweight the training samples (misclassified samples get larger weights).
Weight of weak classifier m: \alpha_m=\frac{1}{2}\ln\frac{1-\epsilon_m}{\epsilon_m}, where \epsilon_m is its weighted error rate.
Linear combination into a strong classifier: G(x)=\mathrm{sign}\left(\sum_m\alpha_m f_m(x)\right)
If the error rate of a weak classifier is 1/2, it is no better than a random classifier.
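A minimal sketch of the AdaBoost reweighting loop, assuming labels in {-1, +1} and precomputed predictions from a pool of candidate weak classifiers:

```python
import numpy as np

def adaboost(weak_preds, y, n_rounds):
    # weak_preds: list of arrays, each one weak classifier's {-1,+1} predictions.
    n = len(y)
    w = np.full(n, 1.0 / n)              # start with uniform sample weights
    chosen, alphas = [], []
    for _ in range(n_rounds):
        # pick the weak classifier with the smallest weighted error rate
        errs = [np.sum(w * (p != y)) for p in weak_preds]
        m = int(np.argmin(errs))
        eps = max(errs[m], 1e-12)
        if eps >= 0.5:                   # no better than random guessing: stop
            break
        alpha = 0.5 * np.log((1 - eps) / eps)
        # boost the weights of misclassified samples, then renormalize
        w *= np.exp(-alpha * y * weak_preds[m])
        w /= w.sum()
        chosen.append(m)
        alphas.append(alpha)
    return chosen, alphas

# Strong classifier: G(x) = sign(sum_m alpha_m * f_m(x)).
```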
Support Vector Machine SVM
Structural risk minimization: maximize the classification margin.
Generative Learning Models
GAN: a generator and a discriminator trained adversarially against each other.
Unsupervised Learning
K-means Clustering
Both data feature and similarity function are important.
Euclidean distance d(x_i,x_j)=\lVert x_i-x_j\rVert_2: the smaller, the more similar.
Process of k-means
- Initialize the centroids of the k clusters randomly.
- Iterate over the data points and assign each data point to the cluster with the closest centroid, based on some measure of distance (such as Euclidean distance).
- Recalculate the centroids of each cluster as the mean of the points assigned to that cluster.
- Repeat steps 2 and 3 until the centroids converge, or until a maximum number of iterations is reached.
- Return the final clusters and their centroids.
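A plain NumPy sketch of the loop above:

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]   # step 1: random init
    for _ in range(n_iter):
        # step 2: assign each point to the nearest centroid (Euclidean distance)
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # step 3: recompute each centroid as the mean of its assigned points
        new_centroids = np.array([X[labels == j].mean(axis=0)
                                  if np.any(labels == j) else centroids[j]
                                  for j in range(k)])
        if np.allclose(new_centroids, centroids):              # step 4: converged
            break
        centroids = new_centroids
    return labels, centroids                                   # step 5
```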
Another View
Minimize the within-cluster variance (sum of squared distances to each centroid).
Principal Components Analysis PCA
Occam’s Razor: the simplest explanation is usually the correct one.
Correlation coefficient, Covariance
Eigenface
An application of PCA.
Represent a face by a few principal features rather than by raw pixels.
- Flatten each 32×32 image into a 1024×1 vector
- Compute the mean face
- Subtract the mean face from each training image to create a set of “difference” images. These difference images represent the deviation of each face from the mean face.
- Compute the covariance matrix of the difference images.
- Compute the eigenvectors and eigenvalues of the covariance matrix.
- Select the “top” eigenvectors, which are the eigenvectors with the largest eigenvalues.
- Use the selected eigenvectors to represent the faces in the dataset.
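A sketch of this pipeline with NumPy; `images` is assumed to be an (n, 1024) matrix of flattened 32×32 faces:

```python
import numpy as np

def eigenfaces(images, n_components):
    mean_face = images.mean(axis=0)           # the mean face
    diffs = images - mean_face                # "difference" images
    # eigenvectors of the covariance matrix, via SVD of the centered data
    _, _, vt = np.linalg.svd(diffs, full_matrices=False)
    top = vt[:n_components]                   # top eigenvectors = eigenfaces
    weights = diffs @ top.T                   # each face as n_components coefficients
    return mean_face, top, weights
```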
Expectation Maximization EM
EM is used for finding the maximum likelihood estimates (MLE) of the parameters in a statistical model.
- Initialize the model parameters with some arbitrary values.
- In the E step, compute the expected value of the complete data log likelihood, given the current model parameters. This expected value is called the “Q function” and is a measure of how well the current model fits the data.
- In the M step, maximize the Q function with respect to the model parameters. This step updates the model parameters to better fit the data.
- Repeat steps 2 and 3 until the model parameters converge, or until a maximum number of iterations is reached.
- Return the final model parameters as the maximum likelihood estimates.
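A minimal EM sketch for a two-component 1-D Gaussian mixture, a common textbook instance of the procedure above:

```python
import numpy as np

def gauss_pdf(x, mu, var):
    return np.exp(-(x - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

def em_gmm_1d(x, n_iter=50):
    # step 1: arbitrary initialization of mixture weights, means, variances
    pi = np.array([0.5, 0.5])
    mu = np.array([x.min(), x.max()])
    var = np.array([x.var(), x.var()])
    for _ in range(n_iter):
        # E step: responsibility of each component for each point
        dens = np.stack([pi[k] * gauss_pdf(x, mu[k], var[k]) for k in (0, 1)])
        resp = dens / dens.sum(axis=0)
        # M step: re-estimate parameters to maximize the expected log likelihood
        nk = resp.sum(axis=1)
        pi = nk / len(x)
        mu = (resp * x).sum(axis=1) / nk
        var = (resp * (x - mu[:, None]) ** 2).sum(axis=1) / nk
    return pi, mu, var   # approximate maximum likelihood estimates
```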
Statistics and Machine Learning
Classification based on Logistic Regression Model
Regression: from linear to non-linear models.
Logistic regression: outputs a probability.
Multi-class: softmax, \mathrm{softmax}(z)_i=\frac{e^{z_i}}{\sum_j e^{z_j}}
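Sketches of the two squashing functions just mentioned:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    z = z - np.max(z)        # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()
```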
Latent Semantic Analysis based on Matrix Factorization
Latent Semantic Analysis LSA aka Latent Semantic Indexing LSI.
Term-document matrix; low-rank approximation (via SVD).
Linear Discriminant Analysis and Classification
aka LDA
- a feature dimensionality-reduction method
- a linear projection into a low-dimensional space in which samples of the same class are close together and samples of different classes are far apart
Deep Learning
Details in Deep Learning and Application.
Reinforcement Learning
Basic Concepts
RL: learning from interaction with the environment.
Agent, policy, state, action, reward.
Feature of RL
- based on evaluative feedback (reward)
- interaction
- sequential decision-making
Discrete Markov Process
An MDP is a tuple (S, A, P, R, \gamma).
Markov chain: P(s_{t+1}\mid s_t,s_{t-1},\cdots,s_0)=P(s_{t+1}\mid s_t) (the Markov property)
Return: G_t=\sum_{k=0}^{\infty}\gamma^k r_{t+k+1}, where \gamma\in[0,1] is the discount factor.
Value function V^\pi(s), action-value function Q^\pi(s,a)
Definition of the RL problem: given an MDP, learn an optimal policy \pi^* that maximizes the expected return.
Bellman Equation
aka Dynamic Programming Equation
Bellman equation for the value function: V^\pi(s)=\sum_a\pi(a\mid s)\sum_{s'}P(s'\mid s,a)\left[R(s,a,s')+\gamma V^\pi(s')\right]
Bellman equation for the action-value function: Q^\pi(s,a)=\sum_{s'}P(s'\mid s,a)\left[R(s,a,s')+\gamma\sum_{a'}\pi(a'\mid s')Q^\pi(s',a')\right]
RL based on Value
Learn Q^*(s,a), then \pi^*(s)=\arg\max_a Q^*(s,a)
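As a sketch, the Bellman optimality equation can be solved by value iteration; `P` and `R` below are assumed tabular MDP inputs:

```python
import numpy as np

def value_iteration(P, R, gamma=0.9, tol=1e-6):
    # P[a]: (S, S) transition matrix for action a; R[a]: (S,) expected rewards.
    n_actions, n_states = len(P), len(P[0])
    V = np.zeros(n_states)
    while True:
        # Q(s, a) = R(s, a) + gamma * sum_s' P(s' | s, a) V(s')
        Q = np.array([R[a] + gamma * P[a] @ V for a in range(n_actions)])
        V_new = Q.max(axis=0)
        if np.max(np.abs(V_new - V)) < tol:
            break
        V = V_new
    return V_new, Q.argmax(axis=0)   # optimal values and the greedy policy
```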
Policy Assessment methods in RL
Dynamic Programming
Monte-Carlo Sampling
Temporal Difference, i.e., Monte-Carlo + DP
- Q-learning: Q(s,a)\leftarrow Q(s,a)+\alpha\left[r+\gamma\max_{a'}Q(s',a')-Q(s,a)\right]
Trade-off between exploration & exploitation: \epsilon-greedy
- Deep Q-learning (DQN): parameterize the Q function with a neural network.
Experience Replay
Target Network
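A tabular Q-learning sketch with an \epsilon-greedy policy; `env` is an assumed interface, not a real library:

```python
import random
from collections import defaultdict

def q_learning(env, n_episodes, alpha=0.1, gamma=0.99, eps=0.1):
    # env.reset() -> state; env.step(s, a) -> (next_state, reward, done); env.actions: list.
    Q = defaultdict(float)
    for _ in range(n_episodes):
        s, done = env.reset(), False
        while not done:
            # epsilon-greedy: explore with probability eps, otherwise exploit
            if random.random() < eps:
                a = random.choice(env.actions)
            else:
                a = max(env.actions, key=lambda act: Q[(s, act)])
            s2, r, done = env.step(s, a)
            # TD update toward r + gamma * max_a' Q(s', a')
            best_next = 0.0 if done else max(Q[(s2, act)] for act in env.actions)
            Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
            s = s2
    return Q
```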
RL based on Policy
Parameterize the policy: \pi_\theta(a\mid s)
Policy Gradient Theorem
Target: maximize J(\theta)=\mathbb{E}_{\pi_\theta}[G_t]; the policy gradient theorem gives \nabla_\theta J(\theta)=\mathbb{E}_{\pi_\theta}\left[\nabla_\theta\log\pi_\theta(a\mid s)\,Q^{\pi_\theta}(s,a)\right]
DRL Application
AlphaGo=DL+RL+MCTS
DQN: Atari games; OpenAI Five: DOTA 2
Game Theory
Basic Concepts
player, strategy, strategy set, outcome, mixed strategy, pure strategy, payoff, expected payoff, rule
Prisoner’s Dilemma
Best response: confess; (confess, confess) is the Nash equilibrium.
Regret Minimization
Counterfactual Regret Minimization, CFR (used in imperfect-information games such as poker)
AI Security
- Encryption Protocol
- Digital watermark
- Data security (availability, integrity, privacy) and model security (robustness, correctness, generalization)