Important considerations while working with a Machine Learning Algorithm
There are many factors which influence your accuracy percentages when you are trying to build a solution using machine learning. One of the most important of such factors is the machine learning algorithm you choose. In this machine learning tutorial we shall look briefly into important considerations while working with a Machine Learning Algorithm
Following are the important aspects of a machine learning algorithm to be considered keeping in mind the features you have chosen, categories that you have, correlation between the features and many more of such :
- What does this machine learning algorithm do ?
- What kind of learning does this algorithm do ?
- What are the assumptions made about features or training data by this algorithm ?
- What is the complexity in regard to computations performed by the algorithm ?
- What are the memory requirements of this algorithm ?
- What are the mathematics that this algorithm is going to use ?
- What is the cost function ?
- What is the cost reduction technique used by the algorithm ?
- What are the advantages of this algorithm over related algorithms ?
- What are the disadvantages imposed by the assumptions made by this algorithm ?
- Based on the above questions, what are the use cases that it can have in its implementation scope ?
- What are the list of other algorithms available that can used in alternative to the algorithm ?
There are many real world problems like spam detection, document classification, entity classification, weather prediction, traffic prediction, fraud detection, online purchase recommendations based on searching history, personalize user’s online retail profile, suggesting diagnostics based on symptoms, natural language processing, financial trading and many many more.
An algorithm in machine learning could solve one or more problems stated above or could help another machine learning algorithm in solving the problem.
Understand your problem and identify the area under which it comes. For example if the problem is regarding classification, or predicting a future course of process based on previous observations (which is regression). For each kind of machine learning problem, there are many machine learning algorithms from which one can be chosen. Analysis of features and categories help in choosing a few from the lot. Answers to the following queries will help you in the narrow down of algorithms.
A machine learning algorithm learns the trends or rules or condition from training data available. Or it can always make use of all the training data available, each time when it tries to predict some useful information for experiment or event or problem instance.
The mechanism or technique in an algorithm always makes some assumptions. These assumptions could be on the number of training experiments or observations available, the relation between the features considered, or limit on number of categories from which it has to choose, the type of noise in feature values, features being linear or non-linear,features’ values being continuous or discrete, and many more.
Complexity is the measure which is proportional to the number of processing steps required or computational power required by an algorithm to make a prediction, or generate a model during training. This could help you in deciding the processing power you might require.
Based on the way in which a machine learning algorithm learns from the data, or how it predicts some useful information for an experiment or observation, or how it models the training data decides the physical memory required for the algorithm to run successfully before it could go out of memory. If you are using any of the established tools like Google Tensor Flow or Deeplearning4j or such, the processes would be optimized to use the memory carefully.
A machine learning algorithm could use probabilistic or statistical or curve fitting or regression or progression or gradients or some other mathematical technique to solve the problem it has chosen. It is very important to understand or get an idea of how it takes help of these mathematical techniques. In course, these mathematical techniques an algorithm chooses, dictate the complexity of the algorithm, some of the assumptions an algorithm considers, memory requirements of the algorithm and such.
A machine learning algorithm tries to fit the training data to a model with reduced cost in prediction. The cost function defines the method to calculate amount of deviation of model’s prediction from the actual facts.
During a machine learning algorithm fits training data to a model, it tries to reduce the cost paid in accuracy. A machine learning algorithm comprises of a technique (like gradient descent) which helps in achieving this goal of reduced cost.
There is huge list of machine learning algorithms out there and possibly many solutions for a given problem. Usually more than one algorithm could solve a given problem. But the challenge is picking the algorithm that has better accuracy than others, and has the assumptions that are in alignment with the training data and problem information. Justifying the choice made in terms of machine learning algorithm should consist of the advantages of this algorithm over other related algorithms. Trying out atleast three or four or more algorithms for a problem and choosing the one with better accuracy is always a good approach.
There is always a trade of to be made in choosing a machine learning algorithm. Usually the assumptions, computational cost and such factors pose limitations. If the assumptions deviate even slightly from the actual training data or features or such, the accuracy may reduce which is a disadvantage. There factors which affect the accuracy should be taken care and the machine learning programmer should always be aware of the consequences when deviation from assumptions happen.
Based on all the factors considered above, there could be only section of problems that this machine learning algorithm could solve or be preferred.
As we already know, there is a huge list of machine learning algorithms. For a given problem that belong to a class of problems, the problem could be solved by any of a group of algorithms. And it is always good that one knows about most of the possible algorithms suitable for solving a particular class of problems, so that the computational power available or resources could be exploited for better accuracy numbers.
This concludes our topic on Important considerations while working with a Machine Learning Algorithm.