![]() |
| Sampling data |
Data warehouse comes in different shape and sizes, our legacy implementation of Data warehouse is Microsoft SQL Server 2008 with many tables inside. Recently we have to export a sample of all tables in our database for auditing purpose, specifically verifying there is no PII (Personal Identifiable Information) in our database.
A simple approach is doing a SELECT TOP 100 * from each table. This can be a little tedious if you have hundreds of tables. One way is to use a for loop to go through each of the tables.
This query will return the names of all the tables in the current database.
SELECT name FROM sysobjects WHERE xtype='U' order by 1;
Or if your database is organised in a schema, you will need the schema name together with the table name.
With that principle, we can loop through each table and apply SELECT TOP 100 * on each of the tables.
Well, turn out that MSSQL has a built-in stored procedure to loop through each of the tables.
Another improvement: the statement 'SELECT TOP 100 *' does just that: top 100 rows from the table. It might not give a good idea about the data in the table. The top 100 rows might contain only bad data and do not give us a grasp of the data in the table. We can use 'TABLESAMPLE' to get more random rows, a true sampling of data.
So sampling of data in tables can be achieved with a one-liner:
SELECT
t.name AS table_name,
SCHEMA_NAME(schema_id) AS schema_name
FROM sys.tables AS t
ORDER BY schema_name, table_name;
With that principle, we can loop through each table and apply SELECT TOP 100 * on each of the tables.
DECLARE @table_name nvarchar(200)
DECLARE @schema_name nvarchar(200)
DECLARE @sql nvarchar(max)
DECLARE db_cursor CURSOR FOR
SELECT
t.name AS table_name,
SCHEMA_NAME(schema_id) AS schema_name
FROM sys.tables AS t
ORDER BY schema_name, table_name;
OPEN db_cursor
FETCH NEXT FROM db_cursor INTO @table_name, @schema_name
WHILE @@FETCH_STATUS = 0
BEGIN
PRINT @schema_name + '.' + @table_name
SELECT
@sql = N'SELECT TOP 100 * FROM [' + @schema_name + '].[' + @table_name + ']'
EXEC sp_executesql @sql;
FETCH NEXT FROM db_cursor INTO @table_name, @schema_name
END
CLOSE db_cursor
DEALLOCATE db_cursor
Well, turn out that MSSQL has a built-in stored procedure to loop through each of the tables.
exec sp_MSforeachtable @command1='SELECT TOP 100 * FROM ?'
Another improvement: the statement 'SELECT TOP 100 *' does just that: top 100 rows from the table. It might not give a good idea about the data in the table. The top 100 rows might contain only bad data and do not give us a grasp of the data in the table. We can use 'TABLESAMPLE' to get more random rows, a true sampling of data.
SELECT * FROM [TABLENAME] TABLESAMPLE(100 ROWS); OR SELECT * FROM [TABLENAME] TABLESAMPLE(1 PERCENT);
exec sp_MSforeachtable @command1='SELECT * FROM ? TABLESAMPLE(100 ROWS)'

How to make money from betting on football - Work Tomake Money
ReplyDeleteIf you're having problems finding a winning bet online for the 1xbet app day of 메이저 토토 사이트 your choosing, then there https://septcasino.com/review/merit-casino/ are plenty of titanium earrings opportunities available right หารายได้เสริม here.
Just like sampling data from multiple SQL tables requires structured queries and controlled execution, HT / LT Panels Manufacturing also relies on systematic design, organized layouts, and precise configuration. Every circuit, breaker, and control element must be correctly mapped—much like database tables—to ensure the system operates safely and efficiently. At E4E, we believe both data management and electrical engineering share the same foundation: accuracy, consistency, and a deep respect for well-structured systems.
ReplyDelete