1. INDUSTRIAL
INTERNSHIP REPORT
MACHINE LEARNING INTERN
Industrial Internship Report submitted to TEAM ECKOVATION
towards partial fulfillment of INTERNSHIP.
SUBMITTED BY-
SHIKHAR SRIVASTAVA
8/17/2018
INTERNSHIP ID: 380
Submitted to: TEAM ECKOVATION
2. ACKNOWLEDGEMENT
I express my thanks and gratitude to the Eckovation team, who
helped me a lot in my internship and learning.
I am extremely thankful to the people who cleared my doubts
and helped me by providing tools and modules during the internship.
This was my Machine Learning internship, where I
learned about Machine Learning applications, algorithms and
implementations, and built an understanding of them.
It was a great experience, in which I implemented a
project titled “teach a neural network to
recognize handwritten text”.
3. Company Profile
Eckovation is a fast-growing organization fully governed
by young and energetic technocrats dedicated to
technology and education development. They offer many
courses, alongside which they provide internships.
Eckovation people are passionate about connecting
students, educators, peers, corporations and workers to
enable dissemination of knowledge and information and
thereby making quality education and skill development
accessible for students and workers across the country.
Through their application, they hope to improve the lives
of millions of students who have the desire and will to
succeed but do not get access to quality education and are
therefore trapped in a vicious cycle of lack of education,
inability to come out of poverty, and descent into crime.
Their platform is a confluence point of various ideas by
all the stakeholders involved in the teaching and
attainment of knowledge process.
They provide wonderful support while a student is interning
with them; they are quite helpful, and it is a great place to learn.
4. TECHNOLOGY
MACHINE LEARNING
Machine learning is a subset of artificial intelligence in
the field of computer science that often uses statistical
techniques to give computers the ability to "learn" (i.e.,
progressively improve performance on a specific task)
with data, without being explicitly programmed.
Machine learning (ML) is a category of algorithms that
allow software applications to become more accurate in
predicting outcomes without being explicitly programmed.
The basic premise of machine learning is to build
algorithms that can receive input data and use statistical
analysis to predict an output while updating outputs as
new data becomes available.
The processes involved in machine learning are similar to
those of data mining and predictive modeling. Both require
searching through data to look for patterns and adjusting
program actions accordingly. Many people are familiar
with machine learning from shopping on the internet and
being served ads related to their purchase. This happens
because recommendation engines use machine learning to
personalize online ad delivery in almost real time. Beyond
personalized marketing, other common machine learning
use cases include fraud detection, spam filtering, network
security threat detection, predictive maintenance and
building news feeds.
HOW MACHINE LEARNING WORKS
Machine learning algorithms are often categorized as
supervised or unsupervised. Supervised algorithms
require a data scientist or data analyst with machine
learning skills to provide both input and desired output, in
addition to furnishing feedback about the accuracy of
predictions during algorithm training. Data scientists
determine which variables, or features, the model should
analyze and use to develop predictions. Once training is
complete, the algorithm will apply what was learned to
new data.
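The supervised workflow described above (inputs paired with desired outputs, a training step, then predictions on new data) can be sketched in a few lines of plain Python. This is only an illustrative nearest-centroid classifier; the function names `fit` and `predict`, the two features and the labels are all invented for this example and are not taken from the project.

```python
# Minimal supervised-learning sketch: a nearest-centroid classifier.
# All data, labels and function names here are made up for illustration.

def fit(samples, labels):
    """Training: compute the mean feature vector (centroid) of each class."""
    sums, counts = {}, {}
    for features, label in zip(samples, labels):
        total = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            total[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in total]
            for label, total in sums.items()}


def predict(centroids, features):
    """Prediction: return the label whose centroid is closest to the input."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, features))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))


# Inputs (two features each) paired with the desired outputs.
train_x = [[1.0, 1.2], [0.8, 1.0], [4.0, 4.2], [4.2, 3.9]]
train_y = ["small", "small", "large", "large"]

model = fit(train_x, train_y)           # training on labelled examples
print(predict(model, [1.1, 0.9]))       # new point near the "small" examples
print(predict(model, [3.9, 4.1]))       # new point near the "large" examples
```

Once `fit` has run, the labelled examples are no longer needed: the model applies what it learned to inputs it has not seen before.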
Unsupervised algorithms do not need to be trained with
desired outcome data. Instead, they look for structure in
unlabeled data on their own, for example by grouping
similar examples together. Deep learning is a related but
separate idea: it uses neural networks with many layers,
which can be trained in either a supervised or an
unsupervised way, and it is used for more complex
processing tasks, including image recognition,
speech-to-text and natural language generation. These
neural networks work by combing through millions of
examples of training data and automatically identifying
often subtle correlations between many variables. Once
trained, the algorithm can use its bank of associations to
interpret new data. These algorithms have only become
feasible in the age of big data, as they require massive
amounts of training data.
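As one small illustration of learning without desired-outcome labels, the sketch below runs k-means clustering in plain Python. It is an assumption-laden toy, not the project's code: the points are invented, the helper name `k_means` is hypothetical, and the initial centroids are simply the first k points rather than a proper random initialisation.

```python
# Toy k-means clustering: no labels are supplied, only raw points.
# Data and the naive "first k points" initialisation are assumptions.

def k_means(points, k, iterations=10):
    centroids = [list(p) for p in points[:k]]   # naive initialisation
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c))
                         for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its members.
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                centroids[i] = [sum(dim) / len(members)
                                for dim in zip(*members)]
    return centroids, clusters


points = [(1, 1), (1.5, 2), (1, 0.5), (8, 8), (9, 9), (8.5, 9.5)]
centroids, clusters = k_means(points, k=2)
print(centroids)
print(clusters)
```

The algorithm is never told which points belong together; the two groups emerge from the distances alone.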
7. EXAMPLES OF MACHINE LEARNING
Machine learning is being used in a wide range of
applications today. One of the most well-known examples
is Facebook's News Feed. The News Feed uses machine
learning to personalize each member's feed. If a member
frequently stops scrolling to read or like a particular
friend's posts, the News Feed will start to show more of
that friend's activity earlier in the feed. Behind the scenes,
the software is simply using statistical analysis and
predictive analytics to identify patterns in the user's data
and use those patterns to populate the News Feed. Should
the member no longer stop to read, like or comment on
the friend's posts, that new data will be included in the
data set and the News Feed will adjust accordingly.
Machine learning is also entering an array of enterprise
applications. Customer relationship management (CRM)
systems use learning models to analyze email and prompt
sales team members to respond to the most important
messages first. More advanced systems can even recommend
potentially effective responses. Business
intelligence (BI) and analytics vendors use machine
learning in their software to help users automatically
identify potentially important data points. Human
resource (HR) systems use learning models to identify
characteristics of effective employees and rely on this
knowledge to find the best applicants for open positions.
Machine learning also plays an important role in self-
driving cars. Deep learning neural networks are used to
identify objects and determine optimal actions for safely
steering a vehicle down the road.
DIFFERENCE BETWEEN MACHINE LEARNING
AND DEEP LEARNING
9. TYPES OF MACHINE LEARNING ALGORITHMS
Just as there are nearly limitless uses of machine learning,
there is no shortage of machine learning algorithms. They
range from the fairly simple to the highly complex. Here
are a few of the most commonly used models:
Regression. This class of machine learning algorithm involves
identifying a correlation -- generally between two
variables -- and using that correlation to make
predictions about future data points.
Decision trees. These models use observations about
certain actions and identify an optimal path for
arriving at a desired outcome.
K-means clustering. This model groups a specified
number of data points into a specific number of
groupings based on like characteristics.
Neural networks. These deep learning models utilize
large amounts of training data to identify correlations
between many variables to learn to process incoming
data in the future.
Reinforcement learning. This area of machine learning
involves models iterating over many attempts to
complete a process. Steps that produce favorable
outcomes are rewarded and steps that produce
undesired outcomes are penalized until the algorithm
learns the optimal process.
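The reward-and-penalty loop in the last item can be sketched with a very small epsilon-greedy agent. This is a simplified, hypothetical setting (three possible actions with fixed rewards of +1 or -1, chosen for this example), not a full reinforcement learning system and not part of the internship project.

```python
import random

# Epsilon-greedy sketch: the agent tries actions repeatedly, is rewarded
# (+1) or penalised (-1), and keeps a running estimate of each action's value.
# The reward table and all parameter values are invented for illustration.

def learn(rewards, steps=200, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    estimates = [0.0] * len(rewards)   # current value estimate per action
    counts = [0] * len(rewards)        # how often each action was tried
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(len(rewards))       # explore at random
        else:
            action = estimates.index(max(estimates))   # exploit best so far
        reward = rewards[action]
        counts[action] += 1
        # Incremental mean of the rewards seen for this action.
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates


rewards = [-1.0, 1.0, -1.0]   # only the second action is "favorable"
estimates = learn(rewards)
best = estimates.index(max(estimates))
print("learned best action:", best)
```

After a handful of attempts the penalised actions have negative estimates, so the agent settles on the rewarded one.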
Classification techniques
They predict discrete responses, for example whether an
email is genuine or spam, or whether a tumor is cancerous
or benign. Classification models classify input data into
categories. Typical applications include medical imaging,
speech recognition, and credit scoring.
Use classification if your data can be tagged, categorized,
or separated into specific groups or classes. For example,
applications for handwriting recognition use
classification to recognize letters and numbers. In image
processing and computer vision, unsupervised pattern
recognition techniques are used for object detection and
image segmentation.
Common algorithms for performing classification include
support vector machine (SVM), boosted and bagged decision
trees, k-nearest neighbor, Naïve Bayes, discriminant
analysis, logistic regression, and neural networks.
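To make one of these algorithms concrete, here is a minimal k-nearest neighbor classifier in plain Python, applied to the spam example mentioned earlier. The two features (number of links and number of exclamation marks per email), the data values and the function name `knn_predict` are assumptions made up for this sketch.

```python
from collections import Counter

# k-nearest neighbor sketch: classify a query by a majority vote among the
# k closest labelled examples. Features and data are invented:
# (number of links, number of exclamation marks) per email.

def knn_predict(train, query, k=3):
    """train is a list of (features, label) pairs."""
    by_distance = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


emails = [
    ((0, 1), "genuine"), ((1, 0), "genuine"), ((0, 0), "genuine"),
    ((7, 9), "spam"), ((8, 6), "spam"), ((9, 8), "spam"),
]

print(knn_predict(emails, (8, 7)))   # close to the spam examples
print(knn_predict(emails, (1, 1)))   # close to the genuine examples
```

The output is always one of the known categories, which is what distinguishes classification from regression.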
Regression techniques
They predict continuous responses, for example changes in
temperature or fluctuations in power demand. Typical
applications include electricity load forecasting and
algorithmic trading.
Use regression techniques if you are working with a data
range or if the nature of your response is a real number,
such as temperature or the time until failure for a piece of
equipment.
Common regression algorithms include linear model,
nonlinear model, regularization, stepwise
regression, boosted and bagged decision trees, neural
networks, and adaptive neuro-fuzzy learning.
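The simplest of these, a linear model with one input variable, can be fitted with the ordinary least-squares formulas. The sketch below is illustrative only: the hour and power-demand numbers are made up (and deliberately lie on a straight line), and `fit_line` is a hypothetical helper name.

```python
# Least-squares fit of a straight line y = slope * x + intercept.
# The data are invented: power demand rising by 2 units per hour from 50.

def fit_line(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


hours = [1, 2, 3, 4, 5]
demand = [52, 54, 56, 58, 60]

slope, intercept = fit_line(hours, demand)
print(slope, intercept)              # fitted line parameters
print(slope * 6 + intercept)         # continuous prediction for hour 6
```

Unlike a classifier, the model returns a real number, so it can predict values that never appeared in the training data.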
13. MY INTERNSHIP PROJECT
TITLE: Teach a Neural Network for Handwritten Digit
Recognition using Machine Learning and the MNIST dataset.
GITHUB PROJECT LINK: https://github.com/Shikhar0609/Teach-a-Neural-Network-to-Read-Handwriting
I could have used any of several algorithms for this project,
such as k-nearest neighbor, random forest classifier or
support vector machine, but I used the “RANDOM FOREST CLASSIFIER”.
Random Forest Classifier
A random forest classifier creates a set of decision trees
from randomly selected subsets of the training set. It then
aggregates the votes from the different decision trees to
decide the final class of the test object.
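Each decision tree in such a forest is built from threshold tests on input features. As a rough illustration only (this is neither the project's code nor scikit-learn's tree-building algorithm), the sketch below learns a one-level tree, often called a decision stump, on a single invented feature. The helper name `train_stump`, the feature values and the fixed rule direction (predict 1 when the value is above the threshold) are all assumptions.

```python
# Decision stump sketch: pick the threshold on one feature that
# misclassifies the fewest training examples.
# Learned rule: predict 1 when value > threshold, otherwise 0.

def train_stump(values, labels):
    ordered = sorted(set(values))
    # Candidate thresholds: midpoints between consecutive distinct values.
    candidates = [(a + b) / 2 for a, b in zip(ordered, ordered[1:])]
    best_threshold, best_errors = None, len(values) + 1
    for threshold in candidates:
        errors = sum((v > threshold) != bool(y)
                     for v, y in zip(values, labels))
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold, best_errors


# Invented feature (e.g. amount of ink in an image region) and 0/1 labels.
values = [1, 2, 3, 6, 7, 8]
labels = [0, 0, 0, 1, 1, 1]

threshold, errors = train_stump(values, labels)
print(threshold, errors)
print(int(5 > threshold))   # prediction for a new value of 5
```

A full decision tree repeats this kind of split recursively on the resulting groups, and a random forest trains many such trees.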
Suppose the training set is given as [X1, X2, X3, X4] with
corresponding labels [L1, L2, L3, L4]. A random forest
may create three decision trees, each taking a subset as
input, for example:
1. [X1, X2, X3]
2. [X1, X2, X4]
3. [X2, X3, X4]
Finally, it predicts based on the majority of votes from
the decision trees.
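The two ideas in this example, drawing random subsets and taking a majority vote, can be sketched as follows. The subsets are drawn without replacement to match the example above, whereas scikit-learn's random forest by default samples with replacement (bootstrap sampling). The per-tree votes are hypothetical values, not output from real trees.

```python
import random
from collections import Counter

# Step 1: draw a random subset of the training set for each tree.
# The seed and subset size are arbitrary choices for this sketch.
rng = random.Random(0)
training_set = ["X1", "X2", "X3", "X4"]
subsets = [sorted(rng.sample(training_set, 3)) for _ in range(3)]
print(subsets)


# Step 2: combine the trees' predictions by majority vote.
def majority_vote(predictions):
    """Return the most common label among the per-tree predictions."""
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical predictions from three trees for one test image.
tree_votes = ["7", "1", "7"]
print(majority_vote(tree_votes))   # two of three trees voted "7"
```

Because each tree sees a different subset, individual mistakes tend to differ between trees, and the vote smooths them out.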
“Random forests or random decision forests are
an ensemble learning method
for classification, regression and other tasks, that operate
by constructing a multitude of decision trees at training
time and outputting the class that is the mode of the
classes (classification) or mean prediction (regression) of
the individual trees. Random decision forests correct for
decision trees' habit of overfitting to their training set.”
15. Requirements
Python 3.5+
scikit-learn (latest version)
NumPy (+ MKL for Windows)
Matplotlib
Overview of MNIST Dataset
The MNIST dataset is a subset of a larger dataset from NIST.
The MNIST database contains 60,000 training
images and 10,000 testing images.
TRAINING TIME
Compute feature vector for positive and negative
examples of image patches.
Train classifier.
TEST TIME
Compute feature vector for image classification.
Evaluate classifier.
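The training-time and test-time steps above can be sketched end to end with scikit-learn. This is a hedged approximation rather than the project's actual code: it uses scikit-learn's small bundled digits dataset (8x8 images, loaded with `load_digits`) as a stand-in for MNIST so that no download is needed, treats the flattened pixel values as the feature vector, and picks the split ratio, the number of trees and the random seeds arbitrarily.

```python
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Stand-in for MNIST (assumption): the small 8x8 digits dataset that
# ships with scikit-learn. Each image is already flattened to 64 features.
digits = load_digits()
X = digits.data
y = digits.target

# Hold out a quarter of the images for testing (arbitrary choice).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# TRAINING TIME: fit the random forest on the training feature vectors.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

# TEST TIME: predict labels for unseen images and evaluate the classifier.
predictions = clf.predict(X_test)
accuracy = accuracy_score(y_test, predictions)
print("Test accuracy: {:.3f}".format(accuracy))
```

With the real MNIST data the same steps apply, with 60,000 training images and 10,000 test images of 28x28 pixels each in place of the split used here.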
18. CONCLUSION
I completed my internship by completing this project. I
declare that the above-mentioned report is my own, and that all
data and material come from my own resources and study.
The internship was a real learning experience. I got to implement
Machine Learning using Python and to create the work described above with it.
19. DECLARATIONS
I, SHIKHAR SRIVASTAVA, MACHINE LEARNING INTERN,
HEREBY DECLARE THAT THIS WORK IS MY OWN AND THAT I HAVE
CREATED IT MYSELF.
NAME: SHIKHAR SRIVASTAVA
PLACE: ONLINE INTERNSHIP
DATE: 17 AUGUST 2018