Classification Model in Machine Learning

 3 years ago
source link: https://matlabhelper.com/blog/matlab/classification-model-in-machine-learning/

These days the term machine learning is used in almost every field and is said to be the future of technology. Classification is one of the most important areas where machine learning is used. These buzzwords are not as tricky as they seem. Interested in knowing what classification is about and how MATLAB supports it? In this blog, we shall learn the basics of practical machine learning methods for classification problems. If you are looking for step-by-step instructions on creating a classification model, from importing data to calculating accuracy, this is the right place. We shall do this with a task: distinguishing the handwritten letters "J", "V" and "M" and classifying each into its respective group.

Data Used:

Different people were asked to write the letters "J", "V" and "M" on a tablet, and the handwritten letters were stored as individual text files. Each file contains four columns, each separated by a comma. The four columns are a timestamp, the horizontal location of the pen, the pen's vertical location, and the pressure of the pen. The timestamp is the number of milliseconds elapsed since the beginning of the data collection. The other variables are in normalized units, i.e., from 0 to 1. The main folder has three sub-folders, one for each J, V, and M.

Import data:

We aim to create a model that classifies a handwritten sample as the letter J, V, or M. Our first step towards this is importing the handwriting data into MATLAB.

You can use the readtable function to import the tabular data from a spreadsheet or text file and store the result as a table.

letter=readtable("J.txt");

This imports the data from the text file J.txt and stores it in a table called letter.

Our next step is to extract the variables X and Y from the table. You can use dot notation to get each column separately from a table. Then, use the plot function to plot them.

plot(letter.X, letter.Y)

You can use the axis equal command to force the axes to preserve the data's aspect ratio. This gives a clearer plot of the letter.

axis equal

The resulting plot shows the letter J.

Repeat the same importing and plotting tasks for the data in the files M.txt and V.txt.

Process data:

Correcting Units

The pen positions for the handwriting data are measured in normalized units (0 to 1). However, the tablet used to record the data is not square. This means a vertical distance of 1 corresponds to 10 inches, while the same horizontal distance corresponds to 15 inches. To correct this, the horizontal units should be adjusted to the range [0 1.5].

You can use dot notation to extract, modify, and reassign variables in a table, just as you would with any variable. Multiply the values in the X variable of the table letter by the aspect ratio of 1.5.

letter.X = 1.5*letter.X;

Plotting the letter before and after this step shows the effect of the aspect-ratio correction made during pre-processing.

Shift the table letter's Time variable to start at 0 by subtracting the first value from all elements. Divide the result by 1000 to convert to seconds. 

letter.Time = (letter.Time - letter.Time(1))/1000;
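
The two pre-processing steps above can be tried without the original files. The following is a minimal sketch on a small made-up table; the values and the column name P for pressure are assumptions chosen for illustration, not taken from the real data.

```matlab
% Hypothetical stand-in for one imported file.
% Column names Time, X, Y follow the article; P (pressure) is an assumed name.
Time = [5000; 5010; 5020; 5030];   % milliseconds since data collection began
X    = [0.20; 0.25; 0.30; 0.35];   % normalized horizontal pen position
Y    = [0.80; 0.60; 0.40; 0.20];   % normalized vertical pen position
P    = [0.50; 0.60; 0.60; 0.40];   % normalized pen pressure
letter = table(Time, X, Y, P);

% Rescale X so one unit covers the same physical distance on both axes.
letter.X = 1.5*letter.X;

% Shift time to start at 0 and convert milliseconds to seconds.
letter.Time = (letter.Time - letter.Time(1))/1000;

disp(letter)
```

After these two lines, Time starts at 0 and is in seconds, and X spans at most the range [0 1.5].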

Extract features:

Calculating features:

What property of these letters can we use to distinguish a J from an M or a V? A property that differs from letter to letter is called a feature. A feature is simply a value calculated from the signal, such as its duration.

  1. For the letters J and M, a simple feature might be the aspect ratio (the height of the letter relative to its width). A J is likely to be tall and narrow, whereas an M is likely to be closer to square.
  2. Compared to J and M, a V is quick to write, so the signal's duration might also be a feature.

Calculate the time taken to write the letter by extracting the last value of letter.Time and storing the result in a variable called dur.

dur = letter.Time(end)

Use the range function to calculate the letter's aspect ratio by dividing the range of values of letter.Y by the range of values of letter.X. Assign the result to a variable called aratio.

The range function returns the range of values in an array. That is, range(x) is equal to max(x)-min(x). 

aratio = range(letter.Y)/range(letter.X)

Change the file name and rerun the script to calculate the same two features for the letters in M.txt and V.txt.
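
Instead of rerunning the script by hand, the two features can be computed in a loop. The sketch below uses three tiny made-up traces in place of the real files, so all the numbers are hypothetical; it also writes the range as max minus min, which is equivalent to the range function used above.

```matlab
% Hypothetical pre-processed traces (Time in seconds, X already scaled by 1.5).
traces.J = table([0; 0.4; 0.9], [0.30; 0.32; 0.25], [0.9; 0.5; 0.1], ...
    'VariableNames', {'Time','X','Y'});
traces.M = table([0; 0.3; 0.6; 0.9; 1.2], [0.1; 0.3; 0.5; 0.7; 0.9], ...
    [0.2; 0.8; 0.4; 0.8; 0.2], 'VariableNames', {'Time','X','Y'});
traces.V = table([0; 0.2; 0.4], [0.20; 0.35; 0.50], [0.8; 0.2; 0.8], ...
    'VariableNames', {'Time','X','Y'});

names  = fieldnames(traces);          % {'J'; 'M'; 'V'}
dur    = zeros(numel(names), 1);
aratio = zeros(numel(names), 1);

for k = 1:numel(names)
    t = traces.(names{k});
    dur(k)    = t.Time(end);                                  % writing time
    aratio(k) = (max(t.Y) - min(t.Y)) / (max(t.X) - min(t.X)); % height/width
end

% Collect the results in a table shaped like the one used in the next section.
features = table(aratio, dur, categorical(names), ...
    'VariableNames', {'AspectRatio','Duration','Character'})
```

With real data, the body of the loop would read each file with readtable and apply the pre-processing steps first.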

View Features

The MAT-file featuredata.mat contains a table of the extracted features for these three letters written by a variety of people. The table "features" has three variables: AspectRatio and Duration (the two features calculated in the previous section), and Character (the known letter).

The gscatter function makes a grouped scatter plot, that is, a scatter plot where the points are coloured according to a grouping variable. Use the gscatter function to plot the two features against each other, with the points coloured according to the letter stored in the Character variable of the table features.
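
A call of this kind might look like the sketch below. Since featuredata.mat is not reproduced here, the table is filled with randomly generated, hypothetical values; only the variable names AspectRatio, Duration, and Character come from the article. The choice of which feature goes on which axis is also an assumption.

```matlab
% Hypothetical stand-in for the "features" table from featuredata.mat.
rng(0)                               % make the random values repeatable
n = 10;                              % made-up number of samples per letter
AspectRatio = [3.0 + 0.4*randn(n,1); 1.0 + 0.2*randn(n,1); 1.6 + 0.3*randn(n,1)];
Duration    = [0.9 + 0.1*randn(n,1); 1.4 + 0.2*randn(n,1); 0.4 + 0.1*randn(n,1)];
Character   = categorical([repmat("J",n,1); repmat("M",n,1); repmat("V",n,1)]);
features = table(AspectRatio, Duration, Character);

% Grouped scatter plot: one colour per letter.
gscatter(features.AspectRatio, features.Duration, features.Character)
xlabel("Aspect ratio")
ylabel("Duration (s)")
```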

Build a model:

What is a classification model?

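In machine learning, classification refers to a predictive modelling problem where a class label is predicted for a given example of input data. A classification model partitions the space of features (here, the plane of aspect ratio and duration) into regions, and each region is assigned one of the output classes.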

There is no single absolute "correct" way to partition the plane into the classes J, M, and V. Different classification algorithms result in different partitions.

An easy way to classify an observation is to give it the same class as the nearest known examples. This is called a k-nearest neighbor (kNN) model. It works on the principle that points of the same class lie close together in feature space. To classify a new observation, the model calculates the distance from it to the known examples, finds the k closest ones, and assigns the class that is most common among them.
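
To make the idea concrete, here is a hand-written sketch of the nearest-neighbour vote for a single observation. It is not how fitcknn is implemented internally; the six known examples, the new point, and k = 3 are all made-up values, and plain Euclidean distance is assumed.

```matlab
% Hypothetical known examples: one row each, columns = [AspectRatio, Duration].
knownX = [3.0 0.9;
          2.8 1.0;
          1.0 1.4;
          1.1 1.3;
          1.6 0.4;
          1.5 0.5];
knownLabels = categorical(["J"; "J"; "M"; "M"; "V"; "V"]);

newX = [1.55 0.45];   % made-up observation to classify
k = 3;                % number of neighbours that vote

% Euclidean distance from the new point to every known example.
d = sqrt(sum((knownX - newX).^2, 2));

% Take the k closest examples and pick their most common label.
[~, idx] = sort(d);
nearest = knownLabels(idx(1:k));
predicted = mode(nearest)
```

Here the two V examples are the closest neighbours, so the vote among the three nearest points gives V.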

Use the fitcknn function to fit a model to the data stored in features. The known classes are stored in the variable called Character. Store the resulting model in a variable called knnmodel.

knnmodel = fitcknn(features,"Character")

Having built a model from the data, you can use it to classify new observations. Here the new data is the test data, stored in the table testdata, which contains observations for which the correct class is known. This gives a way to test your model by comparing the classes predicted by the model with the true classes.

Use the predict function with the trained model knnmodel to classify the letters in the table testdata. The predict function ignores the Character variable when making predictions from the model. Store the predictions in a variable called predictions.

predictions = predict(knnmodel,testdata);

predictions=cell2mat(predictions)

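By default, fitcknn fits a kNN model with k = 1. That is, the model uses just the single closest known example to classify a given observation, which makes it sensitive to outliers in the training data. Increasing the value of k, which you can set with the "NumNeighbors" option of fitcknn, often (though not always) improves accuracy on the test data.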
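
As a sketch of how k can be changed, the example below fits two models on a randomly generated, hypothetical features table (the real one comes from featuredata.mat) and classifies one made-up observation with each. The value k = 5 is an arbitrary choice for illustration.

```matlab
% Hypothetical training data standing in for the "features" table.
rng(0)                               % make the random values repeatable
n = 20;                              % made-up number of samples per letter
AspectRatio = [3.0 + 0.4*randn(n,1); 1.0 + 0.2*randn(n,1); 1.6 + 0.3*randn(n,1)];
Duration    = [0.9 + 0.1*randn(n,1); 1.4 + 0.2*randn(n,1); 0.4 + 0.1*randn(n,1)];
Character   = categorical([repmat("J",n,1); repmat("M",n,1); repmat("V",n,1)]);
features = table(AspectRatio, Duration, Character);

% Fit kNN models with k = 1 (the default) and k = 5.
knn1 = fitcknn(features, "Character");
knn5 = fitcknn(features, "Character", "NumNeighbors", 5);

% Classify one new, made-up observation with both models.
newobs = table(1.5, 0.5, 'VariableNames', {'AspectRatio','Duration'});
pred1 = predict(knn1, newobs)
pred5 = predict(knn5, newobs)
```

Comparing the accuracy of both models on the same test data is a simple way to decide whether a larger k actually helps for a given dataset.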

Evaluate the model

How can we know how well the kNN model classifies? The table testdata includes the known class for the test observations. You can compare the known classes to the kNN model's predictions to see how well the model performs on new data.

Use the == operator to compare predictions to the known classes (stored in the variable Character in the table testdata). Store the result in a variable called iscorrect. 

% Convert both sides to string arrays so each observation is compared as a whole label
iscorrect = string(predictions) == string(testdata.Character)

Calculate the percentage of correct predictions by dividing the number of correct predictions by the total number of predictions and multiplying by 100. You can use the sum function to determine the number of correct predictions and the numel function to determine the total number of predictions.

accuracy = sum(iscorrect)*100/numel(iscorrect)
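
The arithmetic can be checked on a toy case. In this sketch the five true labels and predictions are made up so that exactly four of them agree.

```matlab
% Hypothetical true labels and predictions for five test observations.
trueLabels  = ["J"; "M"; "V"; "J"; "V"];
predictions = ["J"; "M"; "V"; "V"; "V"];   % the fourth one is wrong

iscorrect = predictions == trueLabels;     % logical vector, one entry per observation

% Percentage of correct predictions: 4 of 5.
accuracy = sum(iscorrect)*100/numel(iscorrect)

% The misclassification rate is the complement.
misclassrate = 100 - accuracy
```

Using numel(iscorrect) rather than a fixed number keeps the calculation valid when the size of the test set changes.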

Applications

With these instructions, creating a classification model in MATLAB is straightforward, and the same steps can be extended to various real-life applications such as:

  1. Medical analysis: Classify the severity of a disease using the scans
  2. Factories: Classify the defects on a final product
  3. Traffic: Classify the vehicles on a road as 2-wheeler or 4-wheeler, etc
  4. Banks: Classify the people who will be able to repay the loan and who won't
