Amr Saafan: System Metadata Tables and INFORMATION

Monday, June 1, 2009

System Metadata Tables and INFORMATION_SCHEMA in SQL Server 2005

Information about assemblies, including the assembly code itself and its dependencies, is stored in the system metadata tables. In general, these tables store information about SQL Server database objects, such as tables and indexes. Some metadata tables store information for the entire database instance and exist only in the master database; others are present in every database, both user databases and master. The names of the tables and the information they contain are proprietary.

System metadata tables are performant, however, because they reflect the internal data structures of SQL Server. In the big rewrite that took place in SQL Server 7, the system metadata tables remained intact. In SQL Server 2005, the metadata tables have been overhauled, revising the layout of the metadata information and adding metadata for new database objects.

In addition, programmers and DBAs can no longer write to the system metadata; what they see is really a read-only view. The SQL INFORMATION_SCHEMA, on the other hand, is a series of metadata views defined by the ANSI SQL specification as a standard way to expose metadata.

The views evolve with the ANSI SQL specification; the SQL:1999 standard INFORMATION_SCHEMA views are a superset of the SQL-92 views. SQL Server 2000 supports the INFORMATION_SCHEMA views at the SQL-92 standard level; some of the SQL:1999 views may be added in SQL Server 2005. Not every major database supports the INFORMATION_SCHEMA views: SQL Server, MySQL, and PostgreSQL expose them, while Oracle, for example, relies on its own data dictionary views.
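To check which INFORMATION_SCHEMA views a given instance actually exposes, something like the following can be run. This is only a sketch, and it assumes SQL Server 2005 or later, where the sys.all_views catalog view and the SCHEMA_ID function are available.

```sql
-- List the views that live in the INFORMATION_SCHEMA schema of the current database.
-- sys.all_views includes system-supplied views as well as user-defined ones.
SELECT name
FROM sys.all_views
WHERE schema_id = SCHEMA_ID(N'INFORMATION_SCHEMA')
ORDER BY name;
```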

Getting Metadata from SQL Server:

-- this uses the system metadata tables
SELECT * FROM sysobjects WHERE [type] = 'U'
-- this uses the INFORMATION_SCHEMA
SELECT * FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_TYPE = 'BASE TABLE'

SQL Server 2005 includes a reorganization of the system metadata tables. The tables move into a dedicated schema (named sys), many of them are renamed, and some of the information is reorganized. The goal is speed and naming consistency.

The equivalent query to the previous two using the new system metadata tables would be as follows:

SELECT * FROM SYS.TABLES

Note that the results of the three queries differ in the number of columns returned, the column names used, and the information in the resultset.

Information about assemblies and the assembly code itself is stored in three metadata tables. These tables exist per database, since assemblies are scoped to the database.

The sys.assemblies table stores information about the assembly itself, including assembly_id, the owning principal_id, and the CLR name, which carries the assembly version number. (Pre-release builds also exposed a schema_id here; the released product does not.) The assembly dependencies are stored in sys.assembly_references, one row per assembly-reference pair. Finally, the assembly code itself is cataloged in sys.assembly_files.
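The relationship between these tables can be sketched with a single query. The column names used here (assembly_id, referenced_assembly_id, clr_name, permission_set_desc) are those of the released SQL Server 2005 catalog views as far as I know; pre-release builds may differ.

```sql
-- List each cataloged assembly together with the assemblies it references.
-- Assemblies with no references still appear, with NULL in the last column.
SELECT a.name AS assembly_name,
       a.clr_name,                 -- simple name, version, culture, public key token
       a.permission_set_desc,
       r.name AS referenced_assembly
FROM sys.assemblies AS a
LEFT JOIN sys.assembly_references AS ar
       ON ar.assembly_id = a.assembly_id
LEFT JOIN sys.assemblies AS r
       ON r.assembly_id = ar.referenced_assembly_id
ORDER BY a.name, r.name;
```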

In all cases, this table contains the actual code rather than the name of the file where the code resided when it was cataloged. The original file location is not even kept as metadata. In addition, if you have added a debugger file, using the ALTER ASSEMBLY ADD FILE DDL statement, the debug information will appear as an additional entry in the sys.assembly_files table.
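As a rough illustration of the ADD FILE case, the following sketch attaches a debug file and then lists what is cataloged for the assembly. The assembly name MyUtilities and the file path are hypothetical placeholders, not names taken from this article.

```sql
-- Hypothetical assembly name and path; substitute your own.
ALTER ASSEMBLY MyUtilities
ADD FILE FROM 'C:\build\MyUtilities.pdb';

-- One row per file cataloged for the assembly; the content column holds the bytes.
SELECT a.name AS assembly_name,
       f.file_id,
       f.name AS file_name,
       DATALENGTH(f.content) AS size_in_bytes
FROM sys.assembly_files AS f
JOIN sys.assemblies AS a
  ON a.assembly_id = f.assembly_id
WHERE a.name = N'MyUtilities'
ORDER BY f.file_id;
```

After the ALTER ASSEMBLY statement, the second query should show the added file as an extra row alongside the assembly's main file.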

Notice that you can define an assembly that is “invisible,” meaning that its routines and types cannot be declared to SQL Server. Lack of visibility is the default when SQL Server loads dependent assemblies of an assembly defined using CREATE ASSEMBLY. You might also choose it deliberately, for example, to define a set of utility routines that are invoked only internally.

If you specify IsVisible=true (the default for an assembly you catalog yourself), methods and types in the assembly can be declared as SQL Server methods and types, either through the “list” properties or directly through DDL.
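In the released version of SQL Server 2005, visibility is, to my knowledge, controlled through the VISIBILITY option of ALTER ASSEMBLY and reported in the is_visible column of sys.assemblies, rather than through an IsVisible property. A minimal sketch, again using the hypothetical assembly name MyUtilities:

```sql
-- Hypothetical assembly name. Hide the assembly so that its methods and types
-- cannot be referenced from DDL such as CREATE FUNCTION ... EXTERNAL NAME.
ALTER ASSEMBLY MyUtilities WITH VISIBILITY = OFF;

-- Check the current setting (1 = visible, 0 = invisible).
SELECT name, is_visible
FROM sys.assemblies
WHERE name = N'MyUtilities';
```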

Source: https://www.nilebits.com/blog/2009/08/system-metadata-tables-and-information_schema-in-sql-server/