「MySQL」- 导出数据字典

什么是数据字典

先来一段WIKI解释(有问题找WIKI)

A data dictionary, or metadata repository, as defined in the IBM Dictionary of Computing, is a ”
centralized repository of information about data such as meaning, relationships to other data, origin, usage, and format“.

通常情况下,我很少维护文档式的数据字典(如EXCEL),一般都是在生产库中维护,保证生产库的注释都是最新的,但是应开发的需要,有时候需要导出文档式的数据字典。下面就介绍如何导出数据字典。

为什么需要数据字典

  1. 方便开发查阅。如果数据字典只是存在于数据库的,查阅会非常麻烦。尤其是分库以后,查阅更繁琐。
  2. 方便维护。如果要修改列的注释、添加详细说明,使用数据字典来维护会非常的方便、安全,也无需执行alter语句来修改注释。

为什么是“导出”数据字典

按道理来讲数据字典在最开始就应该存在的,之所以“导出”,完全是因为在“洗锅”。

最初的时候没有维护单独的数据字典,注释只存在于数据库中,等项目大了以后,出现各种问题了,才发现我们需要一份数据字典。

每次都是,出现问题了,人们才会重视。

使用 Navicat + INFORMATION_SCHEMA.COLUMNS 导出数据字典

我最开始用的是INFORMATION_SCHEMA.COLUMNS表,后来被标题党“MySQL利用Navicat导出数据字典”唬了一下,实际上做法是一样的。

导出单个库的数据字典
首先,执行如下SQL:


SELECT TABLE_NAME AS '表名', COLUMN_NAME AS '列名', COLUMN_TYPE AS '列类型',COLUMN_COMMENT AS '列注释'
FROM information_schema.columns
WHERE TABLE_SCHEMA = 'demo';

-- 当然你还以加入其他的列,比如IS_NULLABLE等等。

然后,利用Navicat提供的数据导出功能,将查询结果直接导出到EXCEL中。这样我们就得到了demo库的数据字典。

导出全部库的数据字典
可以导出全部库的数据字典(本质一点没变,具体根据需求处理):


SELECT TABLE_SCHEMA AS '数据库库名', TABLE_NAME AS '表名', COLUMN_NAME AS '列名', COLUMN_TYPE AS '列类型',COLUMN_COMMENT AS '列注释'
FROM information_schema.columns
WHERE TABLE_SCHEMA NOT IN ('information_schema', 'mysql', 'performance_schema', 'sys');

-- 当然你还以加入其他的列,比如IS_NULLABLE等等。

然后,利用Navicat提供的数据导出功能,将查询结果直接导出到EXCEL中。

缺点及不足的地方:

  1. 输出的格式就很难让人满意。
  2. 还要调整输出的格式。

将数据字典导入MediaWWiki

用SQL生成可导入MediaWiki的文本格式。

DROP DATABASE IF EXISTS tmp4Exp;

CREATE DATABASE tmp4Exp;

USE tmp4Exp;


DROP PROCEDURE IF EXISTS `GDD`;

DELIMITER $$

CREATE DEFINER=`root`@`localhost` PROCEDURE `GDD`()

BEGIN


	DECLARE tmp_schema VARCHAR(64) DEFAULT '';

	DECLARE tmp_table VARCHAR(64) DEFAULT '';

	DECLARE tmp_col_name VARCHAR(64) DEFAULT 'UNKNOWN COLUMN NAME';

	DECLARE tmp_col_type LONGTEXT DEFAULT 'UNKNOWN TYPE';

	DECLARE tmp_col_comment VARCHAR(1024) character set utf8 DEFAULT '';

	DECLARE tmp_col_isnullable VARCHAR(16) DEFAULT '';

	DECLARE tmp_col_key VARCHAR(16) DEFAULT '';

	DECLARE tmp_col_extra VARCHAR(16) DEFAULT '';

	DECLARE tmp_col_default VARCHAR(64) DEFAULT '';

	DECLARE tmp_table_comment VARCHAR(1024) character set utf8 DEFAULT '' ;



	DECLARE currentSchema VARCHAR(64) DEFAULT '';

	DECLARE currentTable VARCHAR(64) DEFAULT '';

	DECLARE resultContent MEDIUMTEXT;

	DECLARE is_end INT DEFAULT 0;



	DECLARE dataCurosr CURSOR FOR

		SELECT

			c.`TABLE_SCHEMA`,

			c.`TABLE_NAME`,

			c.`COLUMN_NAME`,

			c.`COLUMN_TYPE`,

			c.`COLUMN_COMMENT`,

			c.`IS_NULLABLE`,

			c.`COLUMN_KEY`,

			c.`EXTRA`,

			c.`COLUMN_DEFAULT`,

			t.`TABLE_COMMENT`

		FROM

			information_schema.COLUMNS AS c

				INNER JOIN

			information_schema.TABLES AS t ON t.`TABLE_NAME` = c.`TABLE_NAME` and t.`TABLE_SCHEMA` = c.`TABLE_SCHEMA`

		WHERE

				c.`TABLE_SCHEMA` NOT IN ('information_schema' , 'mysql', 'sys', 'performance_schema')

		ORDER BY `TABLE_SCHEMA` DESC , `TABLE_NAME` DESC , `ORDINAL_POSITION` ASC;



	DECLARE CONTINUE HANDLER FOR SQLSTATE '02000' SET is_end = 1;



	SET resultContent = '';


		DROP TABLE IF EXISTS `tmpStroage`;

		-- 结果模版,保存结果

		CREATE TEMPORARY TABLE tmpStroage

		(

		`content` VARCHAR(65535) CHARACTER SET utf8

		) DEFAULT CHARSET=utf8;


		OPEN dataCurosr;



		FETCH dataCurosr into tmp_schema, tmp_table, tmp_col_name, tmp_col_type, tmp_col_comment, tmp_col_isnullable, tmp_col_key, tmp_col_extra, tmp_col_default, tmp_table_comment;


	WHILE is_end != 1 DO



				-- 遇到新的数据库

				IF tmp_schema != currentSchema OR tmp_schema = '' THEN

			SET currentSchema = tmp_schema;

			INSERT INTO tmpStroage VALUES

				('#######'),

				(currentSchema);

				END IF;



				-- 如果遇到一张新表

		IF tmp_table != currentTable OR tmp_table = '' THEN

			SET currentTable = tmp_table;

			INSERT INTO tmpStroage VALUES

				(concat('【', currentTable, '】,',tmp_table_comment)),

				('{| class="wikitable"'),

				('|-'),

				(concat('!', concat_ws('!!', '列名', '列类型', '默认值', '是否允许为空', 'KEY类型', '附加信息', '列注释')));

		END IF;



				-- 保存列的信息

		INSERT INTO tmpStroage VALUES ('|-'),

		(concat('|', concat_ws('||',tmp_col_name, tmp_col_type, ifnull(tmp_col_default, ''), tmp_col_isnullable, tmp_col_key, tmp_col_extra, tmp_col_comment)));



		SET resultContent = CONCAT(resultContent, tmp_table_comment);


		FETCH dataCurosr into tmp_schema, tmp_table, tmp_col_name, tmp_col_type, tmp_col_comment, tmp_col_isnullable, tmp_col_key, tmp_col_extra, tmp_col_default, tmp_table_comment;



		-- 如果下一张表为新表,添加表格结尾信息

		IF tmp_table != currentTable THEN

			insert into tmpStroage values ('|}');

		END IF;



	END WHILE;

	CLOSE dataCurosr;


	-- 最后一张表的收尾

	INSERT INTO tmpStroage VALUES ('|}'), ('#######'), ('');


 	SELECT * FROM tmpStroage;


END$$

DELIMITER ;


CALL GDD();

DROP DATABASE

IF EXISTS tmp4Exp;

下面是一个妄图提高性能,却还不如原有性能的做法,执行时间194秒:


DROP PROCEDURE IF EXISTS `GDD`;
DELIMITER $$
CREATE DEFINER=`root`@`localhost` PROCEDURE `GDD`()
BEGIN

	DECLARE tmp_schema VARCHAR(64) DEFAULT '';
    DECLARE tmp_table VARCHAR(64) DEFAULT '';
    DECLARE tmp_col_name VARCHAR(64) DEFAULT 'UNKNOWN COLUMN NAME';
    DECLARE tmp_col_type LONGTEXT DEFAULT 'UNKNOWN TYPE';
    DECLARE tmp_col_comment VARCHAR(1024) DEFAULT '';
    DECLARE tmp_col_isnullable VARCHAR(16) DEFAULT '';
    DECLARE tmp_col_key VARCHAR(16) DEFAULT '';
    DECLARE tmp_col_extra VARCHAR(16) DEFAULT '';
    DECLARE tmp_col_default VARCHAR(64) DEFAULT '';
    DECLARE tmp_table_comment VARCHAR(1024) DEFAULT '';

    DECLARE resultContent mediumtext;

	DECLARE currentSchema VARCHAR(64) DEFAULT '';
    DECLARE currentTable VARCHAR(64) DEFAULT '';

    DECLARE is_end INT DEFAULT 0;

    DECLARE dataCurosr CURSOR FOR
		SELECT
			c.`TABLE_SCHEMA`,
			c.`TABLE_NAME`,
			c.`COLUMN_NAME`,
			c.`COLUMN_TYPE`,
			c.`COLUMN_COMMENT`,
			c.`IS_NULLABLE`,
			c.`COLUMN_KEY`,
			c.`EXTRA`,
            c.`COLUMN_DEFAULT`,
			t.`TABLE_COMMENT`
		FROM
			information_schema.COLUMNS AS c
				INNER JOIN
			information_schema.TABLES AS t ON t.TAble_name = c.table_name
		WHERE
			c.`TABLE_SCHEMA` NOT IN ('information_schema' , 'mysql', 'sys');

    DECLARE CONTINUE HANDLER FOR SQLSTATE '02000' SET is_end = 1;

    OPEN dataCurosr;

    FETCH dataCurosr into tmp_schema, tmp_table, tmp_col_name, tmp_col_type, tmp_col_comment, tmp_col_isnullable, tmp_col_key, tmp_col_extra, tmp_col_default, tmp_table_comment;

	WHILE is_end != 1 DO

        -- 遇到新的数据库
        IF tmp_schema != currentSchema OR tmp_schema = '' THEN
			SET currentSchema = tmp_schema;
            SET resultContent = concat('#######\n', currentSchema, '\n');
        END IF;

        -- 如果遇到一张新表
		IF tmp_table != currentTable OR tmp_table = '' THEN
			SET currentTable = tmp_table;
			SET resultContent = concat(resultContent, concat('表名:', currentTable, ',',tmp_table_comment), '\n');
			SET resultContent = concat(resultContent, concat('{| class="wikitable"'), '\n');
			SET resultContent = concat(resultContent, concat('|-'), '\n');
			SET resultContent = concat(resultContent, concat(concat('!', concat_ws('!!', '列名', '列类型', '默认值', '是否允许为空', 'KEY类型', '附加信息', '列注释'))), '\n');
        END IF;

        -- 保存列的信息
        SET resultContent = concat(resultContent, concat('|-'), '\n');
        SET resultContent = concat(resultContent, concat((concat('|', concat_ws('||',tmp_col_name, tmp_col_type, ifnull(tmp_col_default, ''), tmp_col_isnullable, tmp_col_key, tmp_col_extra, tmp_col_comment)))), '\n');

        FETCH dataCurosr into tmp_schema, tmp_table, tmp_col_name, tmp_col_type, tmp_col_comment, tmp_col_isnullable, tmp_col_key, tmp_col_extra, tmp_col_default, tmp_table_comment;


        -- 如果下一张表为新表,添加表格结尾信息
		IF tmp_table != currentTable THEN
			SET resultContent = concat(resultContent, concat('|}'), '\n');
        END IF;

    END WHILE;

    -- 最后一张表的收尾
    SET resultContent = concat(resultContent, concat('|}'), '\n');

    CLOSE dataCurosr;

SELECT resultContent;

END$$
DELIMITER ;

CALL GDD();

导入MediaWiki的shell脚本

#!/bin/bash

#!/bin/bash

# 获取文档的编辑版本
function get_version()
{
    edit_page=$(curl 'http://10.10.50.228/db_wiki/index.php?title='${1}'&action=edit')
    edit_version=$(echo "$edit_page" | sed -E  's/<input[^<]*value="([0-9]{3})".*name="parentRevId"\/>/#\1#/p'  -n | cut -d \# -f 2)
    echo $edit_version
}

# 向服务器发送数据
function send_data()
{
    # db_name $1
    # content $2
    # version 7
    typeset -u db_name
    db_name=$1
    echo '--------------'

    version=$(get_version $db_name)
    echo 'SEND DATA: DB_NAME: '$db_name', DB_VERSION: '$version

    # --data-bainary @-
    # 从stdin中读取,解决POST数据过长的问题。
    curl -x 10.10.29.131:8080 'http://10.10.50.228/db_wiki/index.php?title='${db_name}'&action=submit' \
	 -H 'Host: 10.10.50.228' \
	 -H 'User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:54.0) Gecko/20100101 Firefox/54.0' \
	 -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8' \
	 -H 'Accept-Language: en-US,en;q=0.5'  --compressed \
	 -H 'Content-Type: multipart/form-data; boundary=---------------------------24444684628010' \
	 -H 'Cookie: db_wiki_session=9hgmatoqsmfvqjlj5pi880b1kvfd8cu3; cpPosTime=1499329935.5279' \
	 -H 'Connection: keep-alive' \
	 -H 'Upgrade-Insecure-Requests: 1' \
	 --data-binary @- <<EOF
-----------------------------24444684628010
Content-Disposition: form-data; name="wpAntispam"


-----------------------------24444684628010
Content-Disposition: form-data; name="wpSection"


-----------------------------24444684628010
Content-Disposition: form-data; name="wpStarttime"

20170706083222
-----------------------------24444684628010
Content-Disposition: form-data; name="wpEdittime"

20170706083215
-----------------------------24444684628010
Content-Disposition: form-data; name="editRevId"

$version
-----------------------------24444684628010
Content-Disposition: form-data; name="wpScrolltop"

0
-----------------------------24444684628010
Content-Disposition: form-data; name="wpAutoSummary"

d41d8cd98f00b204e9800998ecf8427e
-----------------------------24444684628010
Content-Disposition: form-data; name="oldid"

0
-----------------------------24444684628010
Content-Disposition: form-data; name="parentRevId"

$version
-----------------------------24444684628010
Content-Disposition: form-data; name="format"

text/x-wiki
-----------------------------24444684628010
Content-Disposition: form-data; name="model"

wikitext
-----------------------------24444684628010
Content-Disposition: form-data; name="wpTextbox1"

$2

-----------------------------24444684628010
Content-Disposition: form-data; name="wpSummary"


-----------------------------24444684628010
Content-Disposition: form-data; name="wpSave"

保存更改
-----------------------------24444684628010
Content-Disposition: form-data; name="wpEditToken"

+\\
-----------------------------24444684628010
Content-Disposition: form-data; name="mode"

text
-----------------------------24444684628010
Content-Disposition: form-data; name="wpUltimateParam"

1
-----------------------------24444684628010--

EOF

    echo 'DONE'
    echo '-----------------------------------------------------------'

    # 休眠,让服务器缓一下
    sleep 5

}

# out.log文件保存了有GDD存储过程生成的数据。
cat /tmp/out.log | while read line
do
    # db_content=''

    db_content=$db_content
    if [ "$line" == '#######' ]
    then
	# 发送上一轮的数据
	prev_db_name=$db_name
	if [ "$db_content" != "" ]
	then
	    send_data "$prev_db_name" "$db_content"
	    # 清除上一轮数据, 准备存储本轮数据
	    db_content=''
	fi

	# 读取新的数据库
	read db_name
	if [ "$db_name" == "" ]
	then
	    echo "ALL DONE!!!"
	    exit
	else
	    echo '-----------------------------'
	    echo 'NEW DB: '$db_name
	fi


    else
	db_content=$(cat <<EOF
$db_content
$line
EOF
		  )
	# db_content=$db_content$(echo)$line
	# db_content=`date +%s`

    fi
done
# && send_data $db_name $db_content

参考文献

1分钟导出MySQL的数据字典:http://suifu.blog.51cto.com/9167728/1851898
Wikipedia / Data Dictionary:https://en.wikipedia.org/wiki/Data_dictionary