4. Agenda
α Why use an integration tool?
α What's new in Denali CTP3
β For developers
β Deployment, configuration, and administration
α Data Quality Services
9. As a developer:
Have you ever had to…
α Implement variables exposed through configurations?
α Pass variables in parent-child patterns?
α Re-create a component that you deleted by mistake?
α Add annotations to document a package?
α Remap dependent components after replacing a component that generates new lineage IDs?
α Search for elements that use expressions?
α Wait for a connection to resolve (Validating…)?
10. What's New in SQL Server Denali – Integration Services
α User interface improvements
β Help for new users
β Increased productivity
β Shared connection managers
β Customer requests
β Undo
β Flexible order of authoring (FOoA)
β DTSX format (XML)
α Deployment, configuration, and administration
β SSIS server
β Parameters
β Troubleshooting through SSMS
14. What you need to know
α Everything you have today still works in Denali
β Except…
γ Execute DTS 2000 Package tasks
γ ActiveX Script tasks
α Migrating projects to the new model is optional
β Most projects will benefit
15. Preparing for the change to the Project Deployment Model
α Projects
β Do they contain dependent packages? Execute Package tasks?
α Parameters
β Are configurations used?
β Are configurations shared between packages?
α Shared connection managers
β Do my packages use common connections?
α Server execution
β Are any packages stored externally?
16. Migrating existing projects
α Project conversion wizard
β Think in terms of project parameters instead of configurations shared between packages
β Execute Package tasks no longer require connections
α Refine in BIDS
β Consider using global shared connections
β Update Execute Package tasks to resolve expression-based references
20. The Server Gives You…
α Configuration
β Set values for parameters
β Central connection manager configuration
β Advanced property override functionality
α Security
β Transparent encryption of projects and parameter values
β Row-level security to control access to packages
α Management
β Interactive package execution and SQL Agent integration
β Dashboard and built-in reports for troubleshooting
25. Troubleshooting Information
α Now: query execution state while the package is running
β 10,000-foot view
γ What is the current status of this package?
γ How long has it been running?
β 5,000-foot view
γ How many rows have been transferred so far?
γ Which phases has my component completed?
β Ground view
γ How many buffers are used by this execution?
γ How many buffers have been spooled to disk?
γ Can I debug this process?
α Historical: control which information is captured at the server level
β 10,000-foot view
γ When this error occurred, what was the state of the components, connection manager, and variables?
γ How long did this package take to run in the past?
β 5,000-foot view
γ How much memory was available the last time this package was run?
γ How many rows does this package usually transfer?
γ How much time is spent in each of the components in this data flow?
β Ground view
γ Memory dump files (binary and textual)
26. Performance Issues
“These packages were running
well for the past 6 months, taking
less than an hour to complete.
Last night’s run took over 7
hours!”
“Where are the bottlenecks in my
package?”
“Can I get instance-specific
package execution counter
information?”
27. Component Timing & Performance Counters
α Ability to find out the time spent in the data flow components
[Figure: component phases along a time axis: Validate, Pre Execute, ProcessInput, ProcessInput, Post Execute]
    SELECT package_name, task_name, subcomponent_name,
        SUM(DATEDIFF(ms, start_time, end_time)) AS active_time,
        DATEDIFF(ms, MIN(start_time), MAX(end_time)) AS total_time
    FROM catalog.execution_component_phases
    WHERE execution_id = 1841
    GROUP BY package_name, task_name,
        subcomponent_name, execution_path
    ORDER BY package_name, task_name,
        subcomponent_name, execution_path
α Querying performance information for a running execution: dm_execution_performance_counters
    SELECT *
    FROM catalog.dm_execution_performance_counters(@execution_id)
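The difference between active_time and total_time in the query above can be illustrated outside the server. The following is a minimal Python sketch, not the catalog's implementation: the phase records, component names, and the helper `summarize` are hypothetical, chosen only to show that active time sums the individual phase durations while total time spans from the first phase start to the last phase end.

```python
from collections import defaultdict
from datetime import datetime, timedelta

T0 = datetime(2011, 1, 1)  # arbitrary reference time (assumption)


def at(seconds):
    """Return a timestamp `seconds` after the reference time."""
    return T0 + timedelta(seconds=seconds)


# Hypothetical phase records: (subcomponent_name, phase, start_time, end_time).
PHASES = [
    ("Lookup", "Validate", at(0), at(1)),
    ("Lookup", "PreExecute", at(2), at(3)),
    ("Lookup", "ProcessInput", at(5), at(7)),
    ("Destination", "ProcessInput", at(5), at(9)),
]


def summarize(phases):
    """Return {subcomponent: (active_ms, total_ms)}.

    active_ms adds up the duration of every phase (like SUM(DATEDIFF(...))).
    total_ms is the span between the earliest start and the latest end
    (like DATEDIFF(ms, MIN(start_time), MAX(end_time))).
    """
    one_ms = timedelta(milliseconds=1)
    grouped = defaultdict(list)
    for name, _phase, start, end in phases:
        grouped[name].append((start, end))
    result = {}
    for name, spans in grouped.items():
        active = sum((end - start) // one_ms for start, end in spans)
        first_start = min(start for start, _ in spans)
        last_end = max(end for _, end in spans)
        result[name] = (active, (last_end - first_start) // one_ms)
    return result


if __name__ == "__main__":
    for name, (active, total) in summarize(PHASES).items():
        print(f"{name}: active={active} ms, total={total} ms")
```

A component whose total time is much larger than its active time spent most of the run waiting on other components, which is one way to narrow down a bottleneck.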
28. Data Issues
“Some values in our data
warehouse don’t look right. What
went wrong?”
“No rows written for the last
nightly load. Are we dropping
data?”
“The package works on my dev
box... why is it failing on the
production machine?”
29. Data Tap & Row Counts
Ability to find out the number of
rows transferred on the server
Ability to perform data tap
SELECT package_name, task_name,
source_component_name,
destination_component_name,
rows_sent
FROM catalog.execution_data_statistics
WHERE execution_id = 1836
ORDER BY source_component_name,
destination_component_name
34. Data Cleansing
id  dni   nombre              fecha
1   1232  Paco                1
2   1232  Francisco A.        2
3   1232  Francisco Gonzalez  4
4   1234  Victoria Sanchez    5
5   1234  Victor Sanchez      9

sid  nombre              dni   fecha  id
1    Francisco Gonzalez  1232  1      1
1    Francisco Gonzalez  1232  2      2
1    Francisco Gonzalez  1232  4      3
2    Victor Sanchez      1234  5      4
2    Victor Sanchez      1234  9      5
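The transformation between the two tables above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Data Quality Services works internally. It assumes that records are matched on `dni`, that the surviving `nombre` is the one on the most recent record (highest `fecha`), and that `sid` is assigned in order of first appearance. The function name `consolidate` is hypothetical.

```python
# Input rows copied from the first table on the slide.
ROWS = [
    {"id": 1, "dni": "1232", "nombre": "Paco", "fecha": 1},
    {"id": 2, "dni": "1232", "nombre": "Francisco A.", "fecha": 2},
    {"id": 3, "dni": "1232", "nombre": "Francisco Gonzalez", "fecha": 4},
    {"id": 4, "dni": "1234", "nombre": "Victoria Sanchez", "fecha": 5},
    {"id": 5, "dni": "1234", "nombre": "Victor Sanchez", "fecha": 9},
]


def consolidate(rows):
    """Attach a surrogate id and a single canonical name to every row.

    Assumptions (not stated on the slide): rows match when their dni is
    equal, and the canonical name is taken from the row with the highest
    fecha in each group.
    """
    latest = {}  # dni -> most recent row seen for that dni
    sids = {}    # dni -> surrogate id, in order of first appearance
    for row in rows:
        sids.setdefault(row["dni"], len(sids) + 1)
        best = latest.get(row["dni"])
        if best is None or row["fecha"] > best["fecha"]:
            latest[row["dni"]] = row
    return [
        {
            "sid": sids[row["dni"]],
            "nombre": latest[row["dni"]]["nombre"],
            "dni": row["dni"],
            "fecha": row["fecha"],
            "id": row["id"],
        }
        for row in rows
    ]


if __name__ == "__main__":
    for row in consolidate(ROWS):
        print(row)
```

With the sample rows, every record for dni 1232 ends up under sid 1 with the name Francisco Gonzalez, and every record for dni 1234 under sid 2 with the name Victor Sanchez, matching the second table.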
36. Common Data Quality Issues
α Standard: Are data elements consistently defined and understood?
β Sample data problem: Gender code = M, F, U in one system and Gender code = 0, 1, 2 in another system
α Complete: Is all necessary data present?
β Sample data problem: 20% of customers' last names are blank, 50% of zip codes are 99999
α Accurate: Does the data accurately represent reality or a verifiable source?
β Sample data problem: A supplier is listed as 'Active' but went out of business six years ago
α Valid: Do data values fall within acceptable ranges?
β Sample data problem: Salary values should be between 60,000 and 120,000
α Unique: Does the same data appear several times?
β Sample data problem: Both John Ryan and Jack Ryan appear in the system; are they the same person?
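Some of the issues listed above lend themselves to simple automated checks. The sketch below is a hedged Python illustration of the "Complete" and "Valid" checks only. The sample records, field names, and helper functions are all hypothetical and are not part of DQS.

```python
# Hypothetical customer records, used only for illustration.
CUSTOMERS = [
    {"last_name": "Ryan", "zip": "98052", "salary": 75000},
    {"last_name": "", "zip": "99999", "salary": 130000},
    {"last_name": "Lopez", "zip": "99999", "salary": 60000},
    {"last_name": None, "zip": "28001", "salary": 45000},
]


def blank_ratio(records, field):
    """Completeness: fraction of records whose field is missing or blank."""
    if not records:
        return 0.0
    blanks = sum(1 for r in records if not str(r.get(field) or "").strip())
    return blanks / len(records)


def placeholder_ratio(records, field, placeholder):
    """Completeness: fraction of records holding a dummy value such as 99999."""
    if not records:
        return 0.0
    return sum(1 for r in records if r.get(field) == placeholder) / len(records)


def out_of_range(records, field, low, high):
    """Validity: records whose numeric field falls outside [low, high]."""
    return [r for r in records if not (low <= r[field] <= high)]


if __name__ == "__main__":
    print("blank last names:", blank_ratio(CUSTOMERS, "last_name"))
    print("placeholder zips:", placeholder_ratio(CUSTOMERS, "zip", "99999"))
    print("salaries out of range:", out_of_range(CUSTOMERS, "salary", 60000, 120000))
```

The "Standard", "Accurate", and "Unique" issues usually need reference data or matching rules rather than a per-field check, which is where a knowledge base comes in.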
37. DQS High Level Scenarios
α Knowledge Management & Reference Data
β Creating and managing the Data Quality Knowledge Bases
β Discover knowledge from your org's data samples
β Exploration and integration with 3rd party reference data
α Cleansing & Matching
β Correction, de-duplication and standardization of the data
α Administration
β Tools to monitor and control data quality processes
38. Batch Cleansing - Using SSIS
[Diagram: an SSIS package whose data flow takes source data, maps it into a correction component that uses a reference data definition (values/rules), and writes to a destination]
Microsoft Confidential—Preliminary Information Subject to
Change
41. Don't forget to fill in the evaluations on the Summit Portal!
You will find us in the exhibition area at the following times:
α Every day from 09:30 to 18:00
Francisco González
Mentor | Researcher
SQL Server MCITP, MCT
FranciscoAGonzalez@SolidQ.com
Víctor Sánchez
DPA – BI Division
MCITP BI SQL 2008 | MCC 2011
VSanchez@SolidQ.com