This presentation covers tools and techniques used in data science, data analytics and data engineering. It is a collection of graphics and tabular data for quick learning.
2. Tasks To Do
• Install SQL Server and communicate via some client
• Cloud Deployments
• Understand Linux Architecture and basic commands
• Understand IP Addressing
• Understand Hypervisor
• Understand protocols: DNS, DHCP, HTTP, SSL, TLS, HTTPS, FTP, SMTP
• Master Python & TENSORFLOW
• What are microservices, and how do they differ from APIs?
• HashiCorp’s TERRAFORM
• Study of Bahria research groups https://bahria.edu.pk/oric/
3. Companies to work for in the future
• Ublox Lahore https://www.u-blox.com/en/job-openings
• NETSOL
• TERESOL
• CONTOUR SOFTWARE
https://contour-software.com/careers/#Jobs
• TERADATA
http://nicat.pk/
4. Active Job Openings
• https://www.u-blox.com/en/job-openings#Open-jobs
• https://contour-software.com/careers/#Jobs
5. DS Tools and Requirement
Tool: Requirement
• KAFKA: Big Data messaging
• HADOOP: Big Data online storage
• APACHE SPARK: real-time Big Data stream handling
• TABLEAU
• TERRAFORM: multi-cloud management through code
• DOCKER
• KUBERNETES
• TENSORFLOW: low-level software library created by Google to implement ML models and solve complex numerical problems
• KERAS: high-level deep learning API in Python for easy implementation and computation of neural networks
• PYTORCH: low-level API developed by Facebook for NLP and computer vision; a more powerful counterpart to NumPy, with GPU support and automatic differentiation
21. TOP COMPANIES WORKING IN DATA SCIENCE IN DUBAI
• Eurasian Resources Group - ERG
• Cobblestone
• Kognitiv Corporation
• Careem
• First Abu Dhabi Bank
• VISA
• DATABUZZ LTD
• Foodics
• nybl
• Constellation Software, Inc.
• The Emirates Group
• TRANSFERWISE
• TMC
• Binance.US
• Al Futtaim
• Agility
• ARTEFACT
• MARS
• Landmark Group
• Amazon Middle East and NA
• UHRS
• RAK BANK
• Millennium Plaza Hotel Dubai
• Standard Chartered Bank
• Affaan Technologies
• Siemens
• DataRobot
• Arthur Lawrence
• Parsons International
• Manipal Academy of Higher Education, Dubai
• GMG
• WOW AI LLC
• APCO Worldwide
• Accenture
• BlackSky
• Swvl
• Dataiku
• Emirates NBD
• Procter & Gamble
• Zayed University
• Mastercard
80. Cluster Computing / Programming
• A computer cluster is a set of computers that work together so that
they can be viewed as a single system. Unlike grid computers,
computer clusters have each node set to perform the same task,
controlled and scheduled by software.
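The idea of every node running the same task under software scheduling can be sketched on a single machine. This is only an illustration: worker threads stand in for cluster nodes, and the names `node_task` and `run_on_cluster` are hypothetical, not part of any cluster framework.

```python
from concurrent.futures import ThreadPoolExecutor


def node_task(chunk):
    """The one task every node runs: sum the squares of its own chunk."""
    return sum(x * x for x in chunk)


def run_on_cluster(data, num_nodes=4):
    """Split the data into one chunk per node, run node_task on each, combine."""
    # Assumption: threads stand in for the nodes of a real cluster.
    chunks = [data[i::num_nodes] for i in range(num_nodes)]
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        partial_results = list(pool.map(node_task, chunks))
    return sum(partial_results)


total = run_on_cluster(list(range(1000)))
print(total)
```

The caller sees one result, which is the sense in which the cluster can be viewed as a single system.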
88. The processing / computing requirement is either
- Too large
- Or takes too long
on standard computers
89. If a task on an ON-PREMISE cluster of 16 PCs with 4-core processors
each (= 64 cores) takes 3 months, then the same task
can be done in just 16 hours on 125,000 cores in the cloud at the
same or no incremental cost! >>> CLOUD Computing benefits
(Figure: on-premises cluster)
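A rough back-of-the-envelope check of these figures, assuming that 3 months is about 90 days of continuous running and that the work splits perfectly across cores (both are assumptions, not stated on the slide):

```python
HOURS_PER_DAY = 24
DAYS_IN_THREE_MONTHS = 90  # assumption: 3 months of continuous running

# On-premise cluster: 16 PCs with 4 cores each
on_prem_cores = 16 * 4
on_prem_hours = DAYS_IN_THREE_MONTHS * HOURS_PER_DAY
core_hours = on_prem_cores * on_prem_hours  # total work in core-hours

# Cloud run: 125,000 cores, assuming perfect linear scaling
cloud_cores = 125_000
ideal_cloud_hours = core_hours / cloud_cores

# Speed-up implied by the 16 hours quoted on the slide
quoted_cloud_hours = 16
quoted_speedup = on_prem_hours / quoted_cloud_hours

print(f"Total work: {core_hours} core-hours")
print(f"Ideal cloud runtime: {ideal_cloud_hours:.2f} hours")
print(f"Speed-up at the quoted 16 hours: {quoted_speedup:.0f}x")
```

Under these assumptions the ideal runtime comes out at roughly one hour, so the quoted 16 hours leaves room for scheduling and communication overhead.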
149. Apache Spark™: the most widely-used engine for scalable computing
• Thousands of companies, including 80% of the Fortune 500, use Apache Spark™.
• Over 2,000 contributors to the open source project from industry and academia.
159. HADOOP does batch processing only, whereas SPARK also handles real-time (streaming) processing.
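The difference between the two processing styles can be sketched in plain Python. This is a conceptual illustration only and uses neither the Hadoop nor the Spark API; the function names and sample readings are hypothetical.

```python
def batch_average(readings):
    """Batch style: wait for the complete dataset, then compute once."""
    return sum(readings) / len(readings)


def streaming_average(reading_source):
    """Streaming style: update and emit the result as each record arrives."""
    count, total = 0, 0.0
    for value in reading_source:
        count += 1
        total += value
        yield total / count


readings = [10, 20, 30, 40]  # hypothetical sensor readings

batch_result = batch_average(readings)
stream_results = list(streaming_average(iter(readings)))

print(batch_result)    # one answer after all data is in
print(stream_results)  # an updated answer after every record
```

The batch function produces nothing until all the data is available, while the streaming function gives a usable answer after every record.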
189. What is KAFKA – explained again
• A messaging system
• Simplifies management of data pipelines
• Retains messages even when a pipeline fails, e.g. because of a
network issue
• Any message sink can subscribe to a data pipeline
• Queue and publish-subscribe model
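The retention and subscription behaviour listed above can be sketched with a small in-memory log. This is a hypothetical stand-in written for illustration, not Kafka's client API; the class `MessageLog` and its methods are assumed names, and only the publish-subscribe side (not the queue side) is shown.

```python
class MessageLog:
    """Hypothetical in-memory stand-in for a message pipeline (not Kafka's API)."""

    def __init__(self):
        self._messages = []  # append-only: messages are retained after delivery
        self._offsets = {}   # subscriber name -> index of the next unread message

    def publish(self, message):
        self._messages.append(message)

    def subscribe(self, name):
        # A new subscriber starts reading from the beginning of the log.
        self._offsets.setdefault(name, 0)

    def poll(self, name):
        """Return the messages this subscriber has not read yet."""
        start = self._offsets[name]
        new_messages = self._messages[start:]
        self._offsets[name] = len(self._messages)
        return new_messages


log = MessageLog()
log.subscribe("dashboard")
log.publish("order-1")
log.publish("order-2")

first = log.poll("dashboard")   # the dashboard reads both messages
log.subscribe("archive")        # a sink that joins late
late = log.poll("archive")      # still receives the retained messages
log.publish("order-3")
second = log.poll("dashboard")  # only the message it has not yet seen
```

Because each subscriber keeps its own read position, a sink that was offline or joined late can catch up without the producer resending anything.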
206. What is terraform in Hindi/Urdu | Lec-01 | Terraform tutorial for beginners | Infrastructure as Code - YouTube