Leonid Sheremetov

Awareness Raising Workshop, Mexico City, Nov. 23-24

High Performance Computing in
Petroleum Exploration and Production:
IMP experience

Dr. Leonid Sheremetov
sher@imp.mx
Mexican Petroleum Institute

Outline

HPC challenges in the Petroleum Industry
Mexican Petroleum Institute (IMP)
IMP profile
Research Program for Applied Mathematics and
Computing (MAyC)
High performance computing in IMP
Research agenda of HPC in MAyC
Grid-based Simulation (dynamic data driven
applications)
Grid-based Distributed Data Mining
Task assignment in desktop grids
Adaptive grain parallelism
Dynamic task distribution in multi-core clusters
Conclusions

RISC, Nov. 23-24 Mexico-city

O&G Exploration and Production in Mexico
and in the world
PEMEX – Mexican Oil Company:
4 regions,
14 assets,
2488 oil fields,
24645 wells
PEMEX Technology Strategy: technical
innovation and advanced decision
making support
Research funds: CONACYT-SENER
Hidrocarburos and Energías
Renovables

Principal reservoirs are in decline (decreased
recovery)
Increased technical complexity of all
processes (increased cost)
Increased gap between acquired and
utilized data

O&G industry

Transparent Data and Shared Information
Transactional Processes
Remote Operations
Virtual Processes
Collaboration
Immersion Technologies
Knowledge Management
Integrated Operation
Supply Chains Optimization
Operational and Financial Reports
Real Time Information about Reservoirs
Reservoir modeling


O&G industry (continued)

3-D Seismic/simulation: computationally intensive tasks
PEMEX Corporate DB: data intensive applications
Multiple SCADA systems: sensor intensive applications (i-Field)
ADITEP, SIOPDV, SAP; HF Data Historians
Solving grand challenge applications using HPC

Mexican Petroleum Institute (IMP)
• IMP is a public research centre
• IMP was founded on August 23, 1965
• Annual budget: about USD 300 million
• IMP objectives:
• Research and Development
• Application Technologies for PEMEX – Mexican Oil Company
• Consulting
• Education and Training
(postgraduate program
opened in 2003)


Research Program for Applied Mathematics and
Computing

Founded in 2001
Contains:
Researchers
Developers
Scientific Computing Lab
Main Research Areas:
Distributed Intelligent
Computing
• data mining
• computational intelligence
• expert systems
• agent technology
Multiobjective Optimization
• logistics
• supply chain management
Simulation
• partial differential
equations
• numerical methods


Supercomputing in Mexico


HPC in Mexico
CINVESTAV: Xiuhcóatl
Processors: INTEL-AMD-GPGPU
Number of cores: 3480 (CPU),
Real performance: 24.97TFlops
UAM: AITZALOA
Number of nodes: 270 (135 Twin).
Processors: Intel Xeon Quad-Core at 3 GHz
Number of cores: 2160 (540 Quad-Core
CPUs)
Memory: 16 GB RAM per node.
Real performance: 18.4 TFlops.
UNAM: KAN BALAM (HP CP 4000)
Number of nodes: 342 nodes,
Processors: AMD Opteron
Number of cores: 1368 CPU
Memory: 3 Terabytes.
Real performance: 7.1 TFlops

Fujitsu K computer, SPARC64 VIIIfx 2.0GHz,
Tofu interconnect: 705,024 cores, 10,510 TFlops

Mexico and IMP in Top500

MAyC created

Evolution of High Performance Platforms in the IMP

1968: IBM1130
1972: IBM-360/44 for the
Computer Centre and Centre for
Geophysical Processing (analysis
of seismic data for reservoir
characterization)
1980: UNIVAC 1106 (design of oil
platforms)
1982: UNIVAC 1100/82
(multiprocessor), VAX 750
1982: 1st distributed DB in Mexico
2000: Cray Origin 2000
2001: Research Program on
Applied Mathematics and
Computing (PIMAyC).
2001: Lufac Cluster with 256
nodes (2 CPUs each)
2009: Lufac Cluster (Villahermosa)
2011: Supercomputing Lab: Xeon
X7500 CPU, 250 cores - in
progress
Estimated server and cluster capacity (2011): 0.4 TFlops


Applications of HPC in the IMP
Computation intensive tasks
Reservoir simulation
Oceanographic modeling for offshore exploration
Atmospheric modeling
Data intensive tasks
3-D seismic cubes pre-stack analysis and multi-
attribute analysis
Data mining
Computation and communication intensive
tasks:
Collaborative engineering
Nano characterization in 2D and 3D and
nanochemical analysis (shared Lab.)


Improved exploration and production (E&P)
performance
HPC seismic-to-simulation technologies
seamlessly integrate geophysics, geology, and
reservoir engineering in a unified earth model
Schlumberger’s Petrel™ Seismic Server
analyzes terabytes of seismic survey data
represented in 2-D and 3-D displays using
Petrel Geophysics.
ECLIPSE® reservoir simulation software uses
the power of HPC clusters to generate
animated 3-D simulation models
Schlumberger’s software is optimized for
Intel’s Xeon multi-core architecture and builds on
advanced compiler and communication
technology, such as the Intel® MPI Library 3.1
and the Intel® Compiler Suite; it is available as
high-performance cluster software on
Microsoft® Windows® Compute Cluster
Server

Real Time Remote Control of a JEOL JEM 2200
FS Microscope Using Internet 2

IMP Ultra High Resolution
Electron Microscopy Laboratory
is one of the first shared labs in
Mexico. In collaboration with the
UNAM Institute of Physics, it
promotes the creation of national
and international networks for
multidisciplinary scientific
research that share technological
infrastructure through Internet2.
It provides nano characterization
in 2D and 3D and nanochemical
analysis.
These tasks are both computation
and communication (12 Mb I2)
intensive.
Head of Lab. Vicente Garibay
Febles, vgaribay@imp.mx


High resolution image of Pd-catalyst
nanoparticles and its chemical analysis (EDS).


Project CONACYT SENER-Hidrocarburos:
Data Mining

Methods and Techniques of
Computational Intelligence and
Data Mining for Decision
Making in Exploitation of Mature
Fields
Project coordinator:
Instituto Mexicano del Petróleo
(MAyC)
Project collaborators:
CINVESTAV, CIC-IPN, CIMAT,
IIE, INAOE.
Project dates:
March 08, 2011 to March 07, 2013
[Figure: scatterplot of Prod. Aforos (gauged production) and Prod. Diaria Prom (average daily production) against Fecha (date) for well JUJO-2A, 1976 to 2009, each fitted by distance-weighted least squares]

Project CONACYT SENER-Hidrocarburos:
Data Mining

Objective: develop and apply data mining
and computational intelligence (DM&CI)
techniques for the analysis of technical data
on hydrocarbon exploitation, to support decision
making and solution identification and so increase
the efficiency of exploitation of mature fields
Novel approach: top-down (inverse)
modeling based on the analysis of dynamic
oilfield data and reconstruction of the static
characterization and hydro-geological
reservoir models, applying DM&CI to select
poorly drained areas and recovery
methods
Data: one oilfield – 9,464 files, > 50Gb
(without seismic and simulation models)
(2488 oil fields)
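
The figures above suggest the order of magnitude of the data problem. A back-of-the-envelope sketch, assuming that "Gb" means gigabytes and that every one of the 2488 fields holds at least as much data as the sample oilfield (both are assumptions, not statements from the project):

```python
# Rough lower bound on PEMEX-wide technical data volume.
# Assumption: each field holds at least as much as the sample field (> 50 GB),
# still excluding seismic and simulation models.
FIELDS = 2488
GB_PER_FIELD = 50          # lower bound taken from the single-oilfield figure

total_gb = FIELDS * GB_PER_FIELD
total_tb = total_gb / 1000  # decimal terabytes

print(f"at least {total_gb} GB, i.e. about {total_tb:.1f} TB")
```

Under these assumptions the volume is well beyond what a single workstation can mine, which motivates the grid-based approach.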

What grids would we need?

Data grid:
Support for large, distributed data repositories
Computational grid:
Execution of high-end simulation models in parallel
and distributed fashion
Knowledge grid:
Add basic knowledge discovery mechanisms to a grid
A grid architecture specialized for data mining


Grid-based Simulation: Dynamic Data Driven Application
Systems

Formalized by Frederica Darema
Data is fed into an executing application
either as the data is collected or from a
data archive.
The simulation can then make
predictions about the entity regarding
how it will change and what its future
state will be. The simulation is then
continuously adjusted with data gathered
from the entity. The predictions made by
the simulation can then influence how
and where future data will be gathered
from the entity, in order to focus on areas
of uncertainty.
Production history data can be fed to the
reservoir simulator to determine the
reservoir description parameters from the
given performance and to predict the
performance of an oil field.
Intelligent agents are suitable to make
these decisions with regard to which data
to absorb, when it should be absorbed,
and how it should be absorbed.
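
The feedback loop described above can be sketched in a few lines. This is a toy illustration only, not Darema's formulation nor the IMP simulator: the "entity" is a hypothetical list of values, the "simulation" keeps one estimate and one variance per location, and the steering rule (measure where the variance is largest) and the scalar Kalman-style update are assumptions chosen for brevity.

```python
import random

def dddas_loop(true_state, n_steps, noise_sd=0.1, seed=0):
    """Toy dynamic data driven loop: steer measurements, then assimilate them."""
    rng = random.Random(seed)
    n = len(true_state)
    estimate = [0.0] * n          # current model state
    variance = [1.0] * n          # model uncertainty per location
    meas_var = noise_sd ** 2
    visited = []
    for _ in range(n_steps):
        # Steering: the model decides where the next data are gathered,
        # focusing on the area of greatest uncertainty.
        target = max(range(n), key=lambda i: variance[i])
        observation = true_state[target] + rng.gauss(0.0, noise_sd)
        # Assimilation: the running model is adjusted with the new data.
        gain = variance[target] / (variance[target] + meas_var)
        estimate[target] += gain * (observation - estimate[target])
        variance[target] *= 1.0 - gain
        visited.append(target)
    return estimate, variance, visited

estimate, variance, visited = dddas_loop([1.0, 2.0, 3.0], n_steps=6)
```

In a reservoir setting the list entries would be replaced by reservoir description parameters and the observations by production history data; an agent would own the steering decision.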


Distributed Data Mining on Knowledge Grids

TeraGrid (San Diego Supercomputer
Center, National Center for
Supercomputing Applications, Caltech, Argonne
National Lab: scientific data sets mining)
Knowledge Grid (Università di
Catanzaro and DEIS, Università della
Calabria running over MIUR SP3 Italian
national grid)
Terra Wide Data Mining Testbed
(National Center for Data Mining at the
University of Illinois at Chicago)
ADaM (University of Alabama in
Huntsville: hydrology data mining)

IMP&PEMEX - Data mining algorithms
and knowledge discovery processes are
both compute and data intensive,
therefore the Grid can offer a computing
and data management infrastructure for
supporting decentralized and parallel
data analysis.
Adapted from: M. Cannataro, A. Congiusta, A. Pugliese, D. Talia, and
P. Trunfio. Distributed data mining on grids: Services, tools, and applications.
IEEE Transactions on Systems, Man, and Cybernetics, Part B, 34(6), 2004.
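
Decentralized and parallel data analysis often follows one pattern: each site summarizes its own data and only the small summaries travel to a coordinator. A minimal sketch of that pattern, assuming hypothetical site data held as Python lists and using threads as a stand-in for grid nodes (none of this is taken from the cited systems):

```python
from concurrent.futures import ThreadPoolExecutor

def local_summary(values):
    """Runs at a data site: returns (count, sum, sum of squares)."""
    return (len(values), sum(values), sum(v * v for v in values))

def merge(summaries):
    """Runs at the coordinator: combines site summaries into global statistics."""
    n = sum(s[0] for s in summaries)
    total = sum(s[1] for s in summaries)
    total_sq = sum(s[2] for s in summaries)
    mean = total / n
    variance = total_sq / n - mean * mean   # population variance
    return n, mean, variance

def distributed_mean_variance(sites):
    # Each site is summarized independently; raw data never leaves the site.
    with ThreadPoolExecutor() as pool:
        summaries = list(pool.map(local_summary, sites))
    return merge(summaries)

sites = [[1.0, 2.0], [3.0], [4.0, 5.0, 6.0]]   # hypothetical per-site measurements
n, mean, variance = distributed_mean_variance(sites)
```

Only three numbers per site cross the network, which is the property that makes this style attractive when communication is the bottleneck.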


Distributed Data Mining for Modelling of Hydraulic
Communication between Wells

Principal components analysis
Fuzzy clustering (fuzzy K-means)
MAP Transform (trend analysis)
See5 (decision trees and
rulesets)
WizWhy® (association rule
mining), etc.
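
As an illustration of one method in the list, the following is a minimal one-dimensional fuzzy c-means (fuzzy K-means) sketch in plain Python. The data, the initial centers and the fuzzifier m = 2 are assumptions for the example; the project's own implementation and well data are not shown here.

```python
def fuzzy_c_means(points, centers, m=2.0, n_iter=50):
    """1-D fuzzy c-means. Returns (centers, memberships of the last iteration)."""
    centers = list(centers)
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(n_iter):
        memberships = []
        for x in points:
            dists = [abs(x - c) for c in centers]
            if 0.0 in dists:
                # The point sits exactly on a center: share membership among those.
                row = [1.0 if d == 0.0 else 0.0 for d in dists]
                total = sum(row)
                row = [r / total for r in row]
            else:
                # u_i = 1 / sum_k (d_i / d_k)^(2 / (m - 1))
                row = [1.0 / sum((d_i / d_k) ** exponent for d_k in dists)
                       for d_i in dists]
            memberships.append(row)
        # Each center becomes the mean of all points weighted by u^m.
        centers = [
            sum((u[k] ** m) * x for u, x in zip(memberships, points))
            / sum(u[k] ** m for u in memberships)
            for k in range(len(centers))
        ]
    return centers, memberships

# Hypothetical measurements forming two groups.
points = [0.8, 1.0, 1.2, 4.8, 5.0, 5.2]
centers, memberships = fuzzy_c_means(points, centers=[0.0, 6.0])
```

Unlike hard K-means, every point keeps a degree of membership in every cluster, which suits wells whose behaviour is only partly explained by one group.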


Scientific grids: research agenda

In recent years, computational speed has been
increasing geometrically, while communication speed
has increased only linearly.
The complexity of contemporary scientific applications,
with their growing demand for computing power and access
to larger datasets, is setting a trend towards increased
use of grids of desktop personal computers.
Combining many multicore computers in
scientific grids demands a combination of fine- and coarse-
grained parallelization.


Desktop grids (in collaboration with CINVESTAV)

Network topology depends on the
bandwidth available for parallel
processes to communicate
A novel task assignment scheme
that takes the dynamic network
topology into account has been
developed*
The approach is based on the
Bandwidth-aware Bulk Synchronous
Parallel (BSP) computational model
The force field method is used for
synchronisation
The algorithm was tested on grids
composed of 1K nodes

*E. Wilson García and G. Morales-Luna, LNCS-3795
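
For orientation, the classical BSP cost model (not the bandwidth-aware variant of the cited work, whose details are not given here) charges each superstep the slowest local computation, plus the communication gap g times the largest message volume, plus a synchronisation latency l. A minimal sketch, with all names and numbers hypothetical:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Superstep:
    work: List[float]     # local computation time of each process
    traffic: List[float]  # words sent or received by each process

def bsp_cost(supersteps, g, l):
    """Classical BSP cost: sum over supersteps of max(w) + g * max(h) + l."""
    return sum(max(s.work) + g * max(s.traffic) + l for s in supersteps)

program = [
    Superstep(work=[10.0, 12.0], traffic=[4.0, 2.0]),
    Superstep(work=[5.0, 5.0], traffic=[0.0, 0.0]),
]

fast_link = bsp_cost(program, g=2.0, l=3.0)   # (12 + 8 + 3) + (5 + 0 + 3)
slow_link = bsp_cost(program, g=10.0, l=3.0)  # same program, less bandwidth
```

A larger g models a lower-bandwidth link, so the same program costs more on it; a bandwidth-aware assignment tries to place communicating tasks where g is small.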

Task assignment algorithm
Three types of applications were
studied:
High computation, low communication
cost
High computation, middle
communication cost
High computation, high communication
cost
Many parallel applications fall into the
2nd category
Distributed data applications fall into
the 3rd category


Adaptive grain parallelism

Gmandel (http://gmandel.sf.net/) is a
benchmark for computer infrastructure that
generates images of fractals from the
Mandelbrot set. The basic unit of measure
is the MMIPS (Millions of Mandelbrot
Iterations Per Second).
Gmandel runs on Linux (or equivalent) on a
single computer, a multiprocessor computer
(most new multi-core PCs) or an
Infiniband/Myrinet computer cluster.
In each case, Gmandel can take advantage
of multi-core technology by combining
shared-memory, fine-grained and
distributed-memory, coarse-grained parallel
computing techniques. The former is
accomplished with POSIX threads and the
latter by means of MPI message passing
(currently tested with MPICH2, from ANL).
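
The idea of grain size can be illustrated without Gmandel itself. The sketch below is not Gmandel's code: it counts Mandelbrot iterations over a small image, cuts the rows into chunks of a chosen grain, and reports an MMIPS figure computed as total iterations divided by elapsed time, which is an assumed reading of the unit. The chunks run sequentially here; in a parallel run each would go to a thread or an MPI rank.

```python
import time

def mandel_iterations(c, max_iter):
    """Number of iterations of z -> z*z + c performed before |z| exceeds 2."""
    z = 0j
    for n in range(max_iter):
        if abs(z) > 2.0:
            return n
        z = z * z + c
    return max_iter

def row_iterations(y, width, height, max_iter):
    """Total iterations for one image row over the region [-2, 1] x [-1.5, 1.5]."""
    total = 0
    for x in range(width):
        c = complex(-2.0 + 3.0 * x / width, -1.5 + 3.0 * y / height)
        total += mandel_iterations(c, max_iter)
    return total

def split_rows(height, grain):
    """Chunks of `grain` rows: a small grain is fine, a large grain is coarse."""
    return [range(start, min(start + grain, height))
            for start in range(0, height, grain)]

def benchmark(width=64, height=64, max_iter=100, grain=8):
    start = time.perf_counter()
    total = 0
    for chunk in split_rows(height, grain):
        total += sum(row_iterations(y, width, height, max_iter) for y in chunk)
    elapsed = max(time.perf_counter() - start, 1e-9)
    return total, total / elapsed / 1e6   # (iterations, MMIPS)

iterations, mmips = benchmark(grain=8)
```

The total work is the same for any grain; what changes in a real parallel run is the balance between scheduling overhead (many small chunks) and load imbalance (few large chunks).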


Dynamic task distribution in HPC (in collaboration
with CIC-IPN)
The increased complexity of
embedded devices means that
their verification consumes
up to 70% of human and
computational resources
Dynamic planning and
distribution of HDL models
over a parallel simulation
platform (clusters with multi-
core nodes) is a challenging
task
Such a simulation platform
is being developed by
PhD student Josué Rangel
González, in collaboration
with the Embedded Systems
Lab of CIC-IPN
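
Why dynamic distribution matters can be seen in a small scheduling simulation. This is a generic sketch, not the scheduler of the platform under development: task costs are hypothetical simulation times of HDL models, static distribution is plain round-robin, and dynamic distribution hands each task to whichever worker becomes free first.

```python
import heapq

def static_makespan(costs, n_workers):
    """Round-robin assignment fixed before the run starts."""
    loads = [0.0] * n_workers
    for i, cost in enumerate(costs):
        loads[i % n_workers] += cost
    return max(loads)

def dynamic_makespan(costs, n_workers):
    """Each task, in order, goes to the worker that becomes free first."""
    free_at = [0.0] * n_workers
    heapq.heapify(free_at)
    for cost in costs:
        earliest = heapq.heappop(free_at)
        heapq.heappush(free_at, earliest + cost)
    return max(free_at)

# Hypothetical per-model simulation costs: a few heavy models among light ones.
costs = [8, 1, 8, 1, 8, 1, 1, 1]
static_time = static_makespan(costs, n_workers=2)
dynamic_time = dynamic_makespan(costs, n_workers=2)
```

With these costs the static schedule piles all heavy models onto one worker, while the dynamic one finishes sooner because idle workers pick up the remaining tasks.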


Conclusions

Infrastructure next steps:
Cluster installation in the Supercomputing lab.
12 Mb I2 connection
Integration to Delta Metropolitana HPC Grid initiative
Software that takes advantage of the infrastructure:
Current platform: Landmark, Petrel, Eclipse, OFM,
open source
Research agenda:
Novel tasks and approaches enabled by HPC
that increase the efficiency of R&D to meet the needs of
PEMEX


MAyC at the IMP: publications & events

March 12-14 2012, Cancun, Mexico


Thank You! Any Questions?
Leonid Sheremetov
sher@imp.mx
