Neural Network Basics

Restock date: 11.02.2024
Contents: Basics of neural networks.pdf (269.12 KB)
Automatic delivery of goods ✔️
Sales: 0
Refunds: 0
Reviews: 0
Views: 7
Seller: alevtina_sar
Rating: 3.2
Description
1. Neural networks come in the following types:

*Fully connected and recurrent

*Recurrent, convolutional and transformers

*Recurrent, convolutional, fully connected and transformers

2. The classification task is a task of:

*Supervised learning

*Unsupervised learning

*Reinforcement learning

3. Training a neural network is the application of an optimization algorithm to solve the problem of (see the formula below):

*Minimizing the average norm of the gradient of the empirical risk with respect to the model weights

*Minimizing the empirical risk

*Minimizing the average norm of the model weight matrices
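
For reference, the empirical risk minimization in the second option is usually written as the average loss over the training set; the notation below (a model f_θ with weights θ, a per-sample loss L, and training pairs (x_i, y_i)) is standard notation assumed here, not taken from the quiz:

```latex
\hat{\theta} = \arg\min_{\theta}\; \frac{1}{N}\sum_{i=1}^{N} L\bigl(f_{\theta}(x_i),\, y_i\bigr)
```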

4. Check the correct statements about activation functions (see the sketch below):

*The sigmoid activation function lies in the range [0, 1] and can be interpreted as a probability, so it is often used for binary classification problems. The ReLU function is piecewise linear

*The Leaky ReLU function is differentiable everywhere. The popular hyperbolic tangent activation function can be used as the output function for a regression problem. The derivative of the sigmoid function cannot be expressed analytically through the value of the function itself at a given point

*All activation functions are interchangeable because they have the same range of values and the same domain of definition
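
A minimal sketch of the activation functions these statements refer to, using NumPy; the function names and test values are illustrative, not from the quiz:

```python
import numpy as np

def sigmoid(x):
    # Output lies in (0, 1), so it can be read as a probability
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    # Piecewise linear: 0 for x < 0, the identity for x >= 0
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):
    # A small slope for x < 0; the kink at 0 means it is
    # not differentiable everywhere
    return np.where(x >= 0, x, alpha * x)

x = np.linspace(-5.0, 5.0, 11)
print(sigmoid(x).min(), sigmoid(x).max())  # stays strictly inside (0, 1)
print(relu(x))
print(leaky_relu(x))
```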

5. The idea of Momentum is (see the sketch below):

*Calculating the gradient at the point to which the algorithm is about to step according to the accumulated momentum term, rather than at the point from which the step is taken

*Using the idea of physical inertia by adding momentum terms ("velocities")

*Approximate, and therefore faster ("instantaneous"), calculation of gradients at the current point
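
A minimal sketch of the classic momentum ("velocity") update from the second option, on a toy quadratic; the hyperparameter values are illustrative:

```python
def momentum_step(w, v, grad, lr=0.1, mu=0.9):
    # v accumulates an exponentially decaying sum of past gradients
    # and acts like physical inertia (a "velocity")
    v = mu * v - lr * grad
    return w + v, v

# Toy example: minimize f(w) = w^2, so grad = 2w
w, v = 5.0, 0.0
for _ in range(200):
    w, v = momentum_step(w, v, grad=2 * w)
print(w)  # close to the minimum at 0
```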

6. The neural networks most often used in CV (computer vision) are:

*Fully connected

*Convolutional

*Recurrent

7. The machine learning problem can be represented as a sequence of actions for selecting the optimal decision function f from a multi-parameter family F. The learning problem reduces to an optimization problem at the stage of:

*Selecting the family F

*Evaluating the quality of a selected function f from the family F

*Searching for the best function from the family F

8. The derivative of the sigmoid is expressed analytically through the sigmoid itself as (see the derivation below):

*sigm' = sigm(1 - sigm)

*sigm' = 5sigm^(5)

*sigm' = 100sigm/sin(sigm)
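
For reference, the identity in the first option follows directly from the definition of the sigmoid:

```latex
\sigma(x) = \frac{1}{1+e^{-x}}, \qquad
\sigma'(x) = \frac{e^{-x}}{(1+e^{-x})^{2}}
           = \frac{1}{1+e^{-x}} \cdot \frac{e^{-x}}{1+e^{-x}}
           = \sigma(x)\bigl(1-\sigma(x)\bigr)
```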

9. A method for choosing an adaptive learning rate based on accumulated historical gradients (see the sketch below):

*Nesterov Momentum

*RMSProp

*Adagrad
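
A minimal sketch of an Adagrad-style update, where the accumulated squared historical gradients adapt the per-parameter learning rate; the hyperparameter values are illustrative:

```python
import numpy as np

def adagrad_step(w, g_acc, grad, lr=0.5, eps=1e-8):
    # Accumulate squared historical gradients...
    g_acc = g_acc + grad ** 2
    # ...and scale the step down where past gradients were large
    w = w - lr * grad / (np.sqrt(g_acc) + eps)
    return w, g_acc

# Toy example: minimize f(w) = ||w||^2, so grad = 2w
w, g_acc = np.array([5.0, -3.0]), np.zeros(2)
for _ in range(200):
    w, g_acc = adagrad_step(w, g_acc, grad=2 * w)
print(w)  # each coordinate shrinks toward 0 at its own adapted rate
```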

10. During the forward pass through a Feed Forward Neural Network (see the sketch below):

*The model weights are updated based on the gradients calculated in the previous iteration

*The architecture of the model is built by selecting the number of layers and their sizes

*The signal is propagated through sequential matrix multiplications and the application of non-linear activation functions
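
A minimal sketch of such a forward pass for a two-layer fully connected network; the layer sizes and the ReLU/sigmoid choice are illustrative, not from the quiz:

```python
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)    # layer 1: 3 -> 4
W2, b2 = rng.normal(size=(1, 4)), np.zeros(1)    # layer 2: 4 -> 1

def forward(x):
    # Each layer: a matrix multiplication followed by a
    # non-linear activation function
    h = np.maximum(0.0, W1 @ x + b1)              # ReLU
    return 1.0 / (1.0 + np.exp(-(W2 @ h + b2)))   # sigmoid output

print(forward(np.array([1.0, -0.5, 2.0])))
```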

11. The architecture of fully connected neural networks is based on the idea of:

*Generalizing low-level features and generating higher-level ones from them

*Constructing a separating hyperplane

*Minimizing the loss function without using gradient methods

12. The initialization of neural network weights (see the sketch below):

*Must be constant so that the results of training the network on the same training set are reproducible

*Must be random, so that the model can learn without the gradients zeroing out at some step, and such that the variance of the signal does not change as it passes through the layers of the network

*Can be anything
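
A minimal sketch of variance-preserving random initialization in the spirit of Xavier/Glorot; the scaling rule is the standard one, assumed here rather than quoted from the quiz:

```python
import numpy as np

def xavier_init(fan_in, fan_out, rng):
    # Variance 2 / (fan_in + fan_out) keeps the signal variance
    # roughly constant from layer to layer
    std = np.sqrt(2.0 / (fan_in + fan_out))
    return rng.normal(0.0, std, size=(fan_out, fan_in))

rng = np.random.default_rng(0)
W = xavier_init(512, 512, rng)
x = rng.normal(size=512)
print(x.var(), (W @ x).var())  # both variances are of the same order
```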

13. The best way to combat overfitting:

*Changing the model architecture

*Regularization

*Increasing the amount of data

14. The currently most popular optimization method, based on the idea of using two momentum terms and proposed in 2015 (see the sketch below):

*Adam

*Adagrad

*Adadelta
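
A minimal sketch of the Adam update with its two moment terms (a running mean of gradients and a running mean of squared gradients); the hyperparameter values are the commonly cited defaults, with a larger learning rate for this toy problem:

```python
import numpy as np

def adam_step(w, m, v, grad, t, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    m = b1 * m + (1 - b1) * grad       # first moment: running mean
    v = b2 * v + (1 - b2) * grad ** 2  # second moment: running squared mean
    m_hat = m / (1 - b1 ** t)          # bias correction for the
    v_hat = v / (1 - b2 ** t)          # zero-initialized moments
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

# Toy example: minimize f(w) = w^2, so grad = 2w
w, m, v = 5.0, 0.0, 0.0
for t in range(1, 301):
    w, m, v = adam_step(w, m, v, grad=2 * w, t=t)
print(w)  # approaches the minimum at 0
```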

15. Supervised learning is characterized by:

*The goal of training an agent to make optimal decisions in an environment

*The absence of a labeled sample

*The presence of a labeled sample

16. Gradient optimization methods:

*Are iterative algorithms

*Search for a solution to the optimization problem analytically

*Contrary to the name, do not use gradients

17. The Karush-Kuhn-Tucker conditions are applicable for solving:

*Any optimization problem

*Convex optimization problems

*Optimization problems for an arbitrary function on a convex set Q

18. All the algorithms described in the lecture have a common property. Which one?

*All require calculation of the Hessian matrix of the optimized function

*All require calculation of gradients of the optimized function

*All require calculation of the value of the optimized function at a given point

19. Activation functions in neural networks:

*Are nonlinear (globally) and introduce non-linearity into the signal during the forward pass

*Are linear and are needed to check that the model is working

*Activate the neural network in different operating modes

20. Overfitting is an effect that occurs in the case of:

*Excessive complexity of the model relative to the complexity of the training set, which causes "memorization" of the data

*Training the model for too long, so that it loses its predictive ability due to an increase in the entropy of the weights

*Fatigue of the machine learning specialist, because their models take too long to train

21. The backpropagation algorithm (see the sketch below):

*Consists of randomly selecting model weights until an optimal set of parameters that minimizes the error is found

*Is used only for optimizing fully connected neural networks

*Is the sequential computation of gradients with respect to the model weights, starting from the last layer, via the pre-activations of the corresponding layer and the gradients with respect to the weights of the next layer
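
A minimal sketch of backpropagation for a two-layer network with a squared-error loss, computing gradients from the last layer backwards via the stored pre-activations; all names and shapes are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(4, 3)), rng.normal(size=(1, 4))
x, y = rng.normal(size=3), np.array([1.0])

# Forward pass, storing the pre-activation for the backward pass
z1 = W1 @ x                    # pre-activation of layer 1
h1 = np.maximum(0.0, z1)       # ReLU
y_hat = W2 @ h1                # linear output layer
loss = 0.5 * np.sum((y_hat - y) ** 2)

# Backward pass: gradients are computed from the last layer first
d_yhat = y_hat - y             # dL/dy_hat
dW2 = np.outer(d_yhat, h1)     # gradient w.r.t. the last layer's weights
d_h1 = W2.T @ d_yhat           # propagate back to layer 1's output
d_z1 = d_h1 * (z1 > 0)         # through the ReLU at the pre-activation
dW1 = np.outer(d_z1, x)        # gradient w.r.t. the first layer's weights
print(loss, dW1.shape, dW2.shape)
```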

Similar items
Basics of GR management - Seller: alevtina_sar, Rating: 3.2, Sales: 0, Price: $3.80
Facilitation Basics - Seller: alevtina_sar, Rating: 3.2, Sales: 0, Price: $3.26
Entrepreneurship Basics - Seller: alevtina_sar, Rating: 3.2, Sales: 2, Price: $3.26
Basics of gaming technology - Seller: alevtina_sar, Rating: 3.2, Sales: 1, Price: $2.17
BASICS OF ENTREPRENEURSHIP test - Seller: alevtina_sar, Rating: 3.2, Sales: 5, Price: $0.87