SOM: Self-Organising Maps
Self-Organizing Maps (SOMs), also known as Kohonen maps, are a type of artificial neural network commonly used for unsupervised learning and data visualization. They were developed by the Finnish professor Teuvo Kohonen in the 1980s.
The basic idea behind a SOM is to map high-dimensional input data onto a lower-dimensional grid or lattice while preserving the topological relationships and structure of the input data. The grid consists of nodes (neurons), and each node holds a prototype or codebook vector that captures the characteristics of a particular region of the input space.
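As a rough illustration of this layout, the grid of codebook vectors can be stored as a single array whose last axis matches the input dimension. The sizes below (a 10x10 grid over 3-dimensional inputs) are arbitrary assumptions chosen only for the sketch:

```python
import numpy as np

# Assumed sizes for illustration: a 10x10 grid over 3-dimensional inputs.
grid_rows, grid_cols, input_dim = 10, 10, 3

rng = np.random.default_rng(seed=0)
# weights[i, j] is the codebook vector of the node at grid position (i, j);
# it has the same dimension as the input data.
weights = rng.random((grid_rows, grid_cols, input_dim))
```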
Here's how the SOM algorithm works:
- Initialization: Each node in the grid is assigned a weight vector of the same dimension as the input data, typically initialized with random values.
- Input data presentation: A training example from the input data is randomly selected and presented to the SOM.
- Neuron activation: The distance between the input vector and the weight vectors of all the nodes in the grid is calculated, using a distance or similarity measure such as Euclidean distance or cosine similarity. The node whose weight vector is closest to the input is called the "best matching unit" (BMU) or "winning node."
- Neighborhood update: The weights of the BMU and its neighboring nodes are updated to make them more similar to the input vector. The strength of the update decreases with a node's grid distance from the BMU; this neighborhood update is what preserves the topological relationships of the input data.
- Learning rate adjustment: As training progresses, the learning rate (and typically also the neighborhood radius) is gradually decreased, so the magnitude of the weight updates shrinks over time and the map converges.
- Repeat steps 2-5: Steps 2 to 5 are repeated for a fixed number of iterations or until convergence criteria are met. Over the course of training, the SOM gradually adjusts its weights to represent the underlying structure and distribution of the input data (a code sketch of these steps follows this list).
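To make the steps above concrete, here is a minimal NumPy sketch of one training run. The specific choices (toy random data, a 10x10 grid, a Gaussian neighborhood, exponential decay schedules, 2000 iterations) are illustrative assumptions, not part of the algorithm's definition:

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Toy data and grid sizes (illustrative assumptions).
n_samples, input_dim = 500, 3
data = rng.random((n_samples, input_dim))
grid_rows, grid_cols = 10, 10
n_iterations = 2000

# Step 1: initialize the codebook (weight) vectors randomly.
weights = rng.random((grid_rows, grid_cols, input_dim))

# Grid coordinates of every node, used by the neighborhood function.
ii, jj = np.indices((grid_rows, grid_cols))
node_coords = np.stack([ii, jj], axis=2)            # shape (rows, cols, 2)

# Initial learning rate and neighborhood radius (also assumptions).
lr0 = 0.5
sigma0 = max(grid_rows, grid_cols) / 2.0
time_constant = n_iterations / np.log(sigma0)

for t in range(n_iterations):
    # Step 5: decay the learning rate and neighborhood radius over time.
    lr = lr0 * np.exp(-t / n_iterations)
    sigma = sigma0 * np.exp(-t / time_constant)

    # Step 2: present one randomly chosen training example.
    x = data[rng.integers(n_samples)]

    # Step 3: find the best matching unit (BMU) by Euclidean distance.
    dists = np.linalg.norm(weights - x, axis=2)      # shape (rows, cols)
    bmu = np.unravel_index(np.argmin(dists), dists.shape)

    # Step 4: pull the BMU and its neighbors toward x. The Gaussian
    # neighborhood makes the update shrink with grid distance from the BMU.
    grid_dist_sq = np.sum((node_coords - np.array(bmu)) ** 2, axis=2)
    influence = np.exp(-grid_dist_sq / (2.0 * sigma ** 2))
    weights += lr * influence[..., np.newaxis] * (x - weights)
```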
Once training is complete, the grid of neurons provides a low-dimensional representation of the high-dimensional input data. Nodes that are close to each other on the grid tend to have similar weight vectors, meaning they represent similar regions of the input space. This property makes SOMs useful for data visualization and clustering tasks.
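Continuing the sketch above (and reusing its `weights` and `data` arrays), each sample can be mapped to its BMU on the grid; these grid positions are the usual starting point for visualization, hit maps, or clustering:

```python
def bmu_position(weights, x):
    """Return the (row, col) grid position of the best matching unit for x."""
    dists = np.linalg.norm(weights - x, axis=2)
    return np.unravel_index(np.argmin(dists), dists.shape)

# Map every sample onto the grid. Samples that land on the same node,
# or on nearby nodes, come from similar regions of the input space.
assignments = np.array([bmu_position(weights, x) for x in data])
```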
SOMs have been widely used in various applications, including image processing, pattern recognition, data mining, and exploratory data analysis. They provide a powerful tool for visualizing complex data and discovering hidden patterns and structures within the data.