[e][e]Scripting & ETL
MySql is very quick at batch-processing data, and very simple scripts of SQL and shell-code can do all the work (plus perl if necessary).
DTS. DTS (Data Transformation Scripting?) is MS-SQL's scripting language for ETL.
Real-Time Updating
Best Practice for Real-Time Updating [forums.mysql.com]
Looping. Use a bash script to run a piece of SQL over & over again for different dates:
for i in `seq 1 20`; do mysql -hXXX -uXXX dwarehouse -e "SELECT@DATE:='`date --date="$i day ago" +%Y-%m-%d`';SOURCE update_dwarehouse.sql;"; done;
Nice Things for a Script to Have
- If the script is run without arguments, it does nothing, & just prints usage instructions.
- Use the same script to repair or backdate which you use to do the normal overnight processing.
- Have a "status" argument which you can pass
- If there's a running status report while a script runs, print the date & time at each step.
- Lock any fact/dimension tables that the script is writing to for the duration of the script.
- Known default behaviour when a query fails: ideally, the script dies and the error is printed.