![]() |
Sampling data |
Data warehouse comes in different shape and sizes, our legacy implementation of Data warehouse is Microsoft SQL Server 2008 with many tables inside. Recently we have to export a sample of all tables in our database for auditing purpose, specifically verifying there is no PII (Personal Identifiable Information) in our database.
A simple approach is doing a SELECT TOP 100 * from each table. This can be a little tedious if you have hundreds of tables. One way is to use a for loop to go through each of the tables.
This query will return the names of all the tables in the current database.
SELECT name FROM sysobjects WHERE xtype='U' order by 1;
Or if your database is organised in a schema, you will need the schema name together with the table name.
With that principle, we can loop through each table and apply SELECT TOP 100 * on each of the tables.
Well, turn out that MSSQL has a built-in stored procedure to loop through each of the tables.
Another improvement: the statement 'SELECT TOP 100 *' does just that: top 100 rows from the table. It might not give a good idea about the data in the table. The top 100 rows might contain only bad data and do not give us a grasp of the data in the table. We can use 'TABLESAMPLE' to get more random rows, a true sampling of data.
So sampling of data in tables can be achieved with a one-liner:
SELECT t.name AS table_name, SCHEMA_NAME(schema_id) AS schema_name FROM sys.tables AS t ORDER BY schema_name, table_name;
With that principle, we can loop through each table and apply SELECT TOP 100 * on each of the tables.
DECLARE @table_name nvarchar(200) DECLARE @schema_name nvarchar(200) DECLARE @sql nvarchar(max) DECLARE db_cursor CURSOR FOR SELECT t.name AS table_name, SCHEMA_NAME(schema_id) AS schema_name FROM sys.tables AS t ORDER BY schema_name, table_name; OPEN db_cursor FETCH NEXT FROM db_cursor INTO @table_name, @schema_name WHILE @@FETCH_STATUS = 0 BEGIN PRINT @schema_name + '.' + @table_name SELECT @sql = N'SELECT TOP 100 * FROM [' + @schema_name + '].[' + @table_name + ']' EXEC sp_executesql @sql; FETCH NEXT FROM db_cursor INTO @table_name, @schema_name END CLOSE db_cursor DEALLOCATE db_cursor
Well, turn out that MSSQL has a built-in stored procedure to loop through each of the tables.
exec sp_MSforeachtable @command1='SELECT TOP 100 * FROM ?'
Another improvement: the statement 'SELECT TOP 100 *' does just that: top 100 rows from the table. It might not give a good idea about the data in the table. The top 100 rows might contain only bad data and do not give us a grasp of the data in the table. We can use 'TABLESAMPLE' to get more random rows, a true sampling of data.
SELECT * FROM [TABLENAME] TABLESAMPLE(100 ROWS); OR SELECT * FROM [TABLENAME] TABLESAMPLE(1 PERCENT);
exec sp_MSforeachtable @command1='SELECT * FROM ? TABLESAMPLE(100 ROWS)'
How to make money from betting on football - Work Tomake Money
ReplyDeleteIf you're having problems finding a winning bet online for the 1xbet app day of 메이저 토토 사이트 your choosing, then there https://septcasino.com/review/merit-casino/ are plenty of titanium earrings opportunities available right หารายได้เสริม here.
They are all secure, reliable, and filled with thrilling real money casino video games and bonuses. You’ll even discover keno, bingo, and scratch playing cards among the specialty video games. Looking for a thecasinosource.com model new real money casino that accepts US players? We’re frequently testing on-line gambling websites as they hit the market.
ReplyDelete