tm-program-20141127

TM Study Group (「ＴＭの会」) Program
/ November 27, 2014 /

　

　● A Relational Model of Data for Large Shared Data Banks　（E.F. Codd, 1970）
　1.3. A Relational View of Data
　The term relation is used here in its accepted mathematical sense. Given sets S₁, S₂, ・・・, S_n (not necessarily distinct), R is a relation on these n sets if it is a set of n-tuples each of which has its first element from S₁, its second element from S₂, and so on.¹
　¹ More concisely, R is a subset of the Cartesian product S₁ × S₂ ×・・・× S_n.
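　In other words, a relation is a subset of a Cartesian product.
　Mathematically, the general form is R = { (x₁, x₂,・・・, x_n) | x₁ ∈ S₁, x₂ ∈ S₂,・・・, x_n ∈ S_n ∧ P (x₁, x₂,・・・, x_n) }.
　This formula can also be explained in terms of the axiom of choice: if you pick one member from each non-empty set and line the picks up, the result is a tuple (n-ary, that is, n-place). Incidentally, a tuple is itself a set.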
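　The general form above can be tried out on small finite sets. The following is only a minimal sketch: the domains S1 and S2 and the predicate P are hypothetical, chosen for illustration, and do not come from Codd's paper.

```python
from itertools import product

# Hypothetical domains, chosen only for illustration.
S1 = {1, 2, 3}
S2 = {1, 2, 3}


def P(x1, x2):
    """Hypothetical predicate: x1 is strictly less than x2."""
    return x1 < x2


# The Cartesian product S1 x S2 is the set of all possible 2-tuples.
cartesian = set(product(S1, S2))

# R keeps exactly those tuples of the product that satisfy P.
R = {t for t in cartesian if P(*t)}

# The degree of R is the number of elements in each of its tuples.
degree = len(next(iter(R)))
```

　Here R is a subset of the product by construction, and each of its members is a 2-tuple whose first element comes from S1 and whose second comes from S2.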
　R is said to have degree n. Relations of degree 1 are often called unary, degree 2 binary, degree 3 ternary, and degree n n-ary.
　"Degree" is 「次数」 in Japanese.
　Mathematics usually works with binary relations [TM does too]. The relational model treats a collection of data as an n-ary relation.
　An array which represents an n-ary relation R has the following properties:
　（1） Each row represents an n-tuple of R.
　（2） The ordering of rows is immaterial.
　（3） All rows are distinct.
　（4） The ordering of columns is significant──it corresponds to the ordering S₁, S₂,・・・, S_n
　　　of the domains on which R is defined (see, however, remarks below on domain-ordered
　　　and domain-unordered relations).
　（5） The significance of each column is partially conveyed by labeling it with the name of
　　　the corresponding domain.
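　（1） Each row is an n-tuple of the relation.
　（2） The ordering among rows (how the rows are arranged) does not matter within
　　　the Cartesian product.
　　　immaterial;　unimportant under the circumstances, irrelevant.
　（3） Every row is recognized as distinct from every other row.
　（4） The ordering among columns (how the columns are arranged) is meaningful:
　　　it corresponds to the ordering of the sets (domains) on which the relation is defined.
　（5） The meaning of each column is conveyed, to some extent, by the name given to
　　　the corresponding set (domain).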
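　A Cartesian product is itself a set, so by the nature of sets the order of its members does not matter. Mathematically, think of a set as a collection of separate members in which structure (the mutual relationships among the members) is set aside for the time being. Bolzano, who preceded Cantor, the founder of set theory, used the example of a broken glass to explain the set as a concept independent of structure: if we compare one intact glass with a broken one, the two certainly consist of the same parts, but the way the fragments are joined and arranged is entirely different. The concept that is determined regardless of how the constituents are joined or arranged is called a set.
　Codd presumably wrote that "the significance of each column is partially conveyed" because of the problem of recursion discussed next.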
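　Properties (2), (3), and (4) can be checked directly if a relation is modeled as a set of tuples. This is a minimal sketch, assuming Python's frozenset as a stand-in for a mathematical set; the part names and quantities are hypothetical.

```python
# Hypothetical rows of a binary relation on (part, quantity).
rows_a = [("bolt", 4), ("nut", 4), ("washer", 8)]
# The same rows in another order, with one row repeated.
rows_b = [("washer", 8), ("bolt", 4), ("nut", 4), ("bolt", 4)]

relation_a = frozenset(rows_a)
relation_b = frozenset(rows_b)

# (2) Row order is immaterial, and (3) duplicate rows collapse,
# so both listings denote the same relation with three rows.
same_relation = relation_a == relation_b
row_count = len(relation_b)

# (4) Column order is significant: swapping the two columns
# produces different tuples and therefore a different relation.
swapped = frozenset((qty, part) for part, qty in rows_a)
column_order_matters = swapped != relation_a
```

　The set forgets how the rows were listed, but each tuple still remembers the position of its elements, which is why column order remains significant in a domain-ordered relation.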
　One might ask: If the columns are labeled by the name of corresponding domains, why should the ordering of columns matter? As the example in Figure 2 shows, two columns may have identical headings (indicating identical domains) but possess distinct meanings with respect to the relation. The relation depicted is called component. It is a ternary relation, whose first two domains are called part and third domain is called quantity. The meaning of component (x, y, z) is that part x is an immediate component (or subassembly) of part y, and z units of part x are needed to assemble one unit of part y. It is a relation which plays a critical role in the parts explosion problem.
　Users should not normally be burdened with remembering the domain ordering of any relation (for example, the ordering supplier, then part, then project, then quantity in the relation supply). Accordingly, we propose that users deal, not with relations which are domain-ordered, but with relationships which are their domain-unordered counterparts.² To accomplish this, domains must be uniquely identifiable at least within any given relation, without using position. Thus, where there are two or more identical domains, we require in each case that the domain name be qualified by a distinctive role name, which serves to identify the role played by that domain in the given relation. For example, in the relation component of Figure 2, the first domain part might be qualified by the role name sub, and the second by super, so that users could deal with the relationship component and its domains--sub.part, super.part, quantity--without regard to any ordering between these domains.
　To sum up, it is proposed that most users should interact with a relational model of the data consisting of a collection of time-varying relationships (rather than relations). Each user need not know more about any relationship than its name together with the names of its domains (role qualified whenever necessary).³
　² In mathematical terms, a relationship is an equivalence class of those relations that are equivalent under permutation of domains.
　permutation;　Math. action of changing the arrangement of a set of items.
　³ Naturally, as with any data put into and retrieved from a computer system, the user will normally make far more effective use of the data if he is aware of its meaning.
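　If each column carries the name of its corresponding set (domain), why should the ordering of columns matter? Codd answers with the example component (part, part, quantity). Generally speaking, this is the problem of recursion (a recursive relation). Suppose, for example, that there is a set (domain) "Part" and that a relation expresses "(parts) structure". Using domain names, the "(parts) structure" is written (Part (R), Part (R)), where (R) is the notation used in TM to indicate that the term designates a set (domain). Then, as a rule for ordering the columns (incidentally, as the general form of a relation shows, the terms that make up an n-tuple must be arranged by some rule), we stipulate that the part written first denotes the child part and the part written second denotes the parent part (the reverse pairing, (parent part, child part), would do equally well). However, when a tuple (relation) consists of many columns (Codd remarks that a degree of 30 is not at all uncommon), Codd says it is unreasonable to burden users with remembering how the domains are ordered. He therefore proposes that users work not with relations, which presuppose a domain ordering, but with the corresponding relationships, which disregard domain ordering.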
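　Footnote 2 and the role names sub.part and super.part can be illustrated as follows. This is only a sketch under stated assumptions: the rows are hypothetical (Figure 2 is not reproduced here), and the helper functions to_relationship and permute_columns are names invented for this illustration.

```python
from itertools import permutations

# Role-qualified names follow Codd's example; the rows are hypothetical.
HEADING = ("sub.part", "super.part", "quantity")
component = {
    ("wheel", "bicycle", 2),
    ("spoke", "wheel", 32),
}


def to_relationship(heading, relation):
    """Forget column positions: each row becomes a set of (name, value) pairs."""
    return frozenset(frozenset(zip(heading, row)) for row in relation)


def permute_columns(heading, relation, order):
    """Return the heading and the relation with columns rearranged by `order`."""
    new_heading = tuple(heading[i] for i in order)
    new_relation = {tuple(row[i] for i in order) for row in relation}
    return new_heading, new_relation


baseline = to_relationship(HEADING, component)
orders = list(permutations(range(len(HEADING))))
permuted = [permute_columns(HEADING, component, o) for o in orders]

# The 3! = 6 column orderings give six different domain-ordered relations ...
distinct_relations = {frozenset(rel) for _, rel in permuted}
# ... but all of them collapse to one and the same relationship.
all_equivalent = all(to_relationship(h, r) == baseline for h, r in permuted)

# Without role names the two "part" columns cannot be told apart:
# (wheel, bicycle) and (bicycle, wheel) become indistinguishable.
unqualified = ("part", "part", "quantity")
ambiguous = (to_relationship(unqualified, {("wheel", "bicycle", 2)})
             == to_relationship(unqualified, {("bicycle", "wheel", 2)}))
```

　The last comparison shows why position can be dropped only once identical domains are qualified by distinct role names: with the bare heading (part, part, quantity), the direction of the parent-child link is lost.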
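　As far as using {parent part, child part} as access-names (implementation names) is concerned, I (Masami Sato) do not object to Codd's proposal (TM does the same at implementation time). However, I cannot agree with using {parent part, child part} as descriptive-names (the "logical" names in the model). (Codd, too, regards {part, part} as theoretically correct and considers {parent part, child part} only because the former is hard to work with in practice.) "Parent" and "child" are names mentioned within the relation, not names that designate a set (a plain set). If the grammar of the model is the rule that relationships are constructed from plain sets, then a descriptive-name must state explicitly which sets (or which members of sets) were used to construct the relationship, so the sound way to write it is {Part (R), Part (R)}. In other words, what the construction of a relationship must make clear is which sets serve as its terms, not its "meaning" (here "meaning" is the meaning fixed by context, not sense, which designates facts, that is, real things).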
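　The distinction between the two kinds of names can be made concrete with a small data structure. This is a hypothetical sketch, not TM's actual notation or tooling: the class Term, its two methods, and the role names "child" and "parent" are assumptions introduced here for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Term:
    """One term of a relationship: a set (domain) plus an optional role."""
    domain: str
    role: Optional[str] = None

    def descriptive_name(self) -> str:
        # Model-level name: only the underlying set, marked with (R).
        return f"{self.domain} (R)"

    def access_name(self) -> str:
        # Implementation-level name: role-qualified when a role is given.
        return f"{self.role}.{self.domain}" if self.role else self.domain


# A recursive "(parts) structure" built twice from the same set "Part".
parts_structure = (Term("Part", "child"), Term("Part", "parent"))

descriptive = [t.descriptive_name() for t in parts_structure]
access = [t.access_name() for t in parts_structure]
```

　In this sketch the descriptive-names are both "Part (R)", which records which set each term was drawn from, while the access-names "child.Part" and "parent.Part" are unique and therefore usable without relying on position.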
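　Codd states that any data put into and retrieved from a computer will be used far more effectively if the user knows its meaning. At the risk of quibbling, I think the issue here is efficiency rather than effectiveness. The reason is that the "meaning" of a set (domain) described by a model is the name given to the "satisfied values" within the designation relation between signs (words, language) and factual objects, and in a model the satisfaction of values is what is called "interpretation". In fact, this issue [whether to use the domain name or the name within the relation] is the difficult question of the "correspondence theory of meaning" versus the "use theory of meaning". My practice is to follow the correspondence theory when building a model and the use theory when implementing it. For example, {Warehouse (R), Shelf (R), Product (R)} denotes, in the model (or syntactically), a tuple composed of "Warehouse", "Shelf", and "Product", but its meaning is "inventory". Readers interested in this question should read Wittgenstein's works (Tractatus Logico-Philosophicus and Philosophical Investigations).
(This section continues next time.)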
　　→ Whiteboard photos
　
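　★ When you meet a word you do not understand in an English text, use an English-English dictionary as much as possible. The trick is to look up not so much the words you do not know as the words you (think you) already know. You learned those words through their Japanese translations, so you have often memorized them with a nuance that differs from the English. Consequently, even when the dictionary explains a difficult English word in simple English words, you have memorized those simple words in Japanese, so there is a high risk of translating the dictionary's explanation into Japanese and only believing you have understood it.
　For example, in the sentence from Codd's paper "The totality of data in a data bank may be viewed as a collection of time-varying relations", totality is hard to grasp [to picture in your head]. Consider why Codd did not use all or whole. Translating it into Japanese as 「全体、総計」 (the whole, the sum total) does not give a concrete picture either. Total carries the nuance "everything without exception has been counted". As another example, consider why immaterial is used in "The ordering of rows is immaterial". The antonym of immaterial is material. The sense is presumably that row ordering is not taken into account in implementation, which is close to not relevant, where relevant means "closely connected or appropriate to the current matter". In short, immaterial means irrelevant (= unimportant under the circumstances).
　This is probably true of any language: first-rate writers choose their vocabulary and style with great care. Readers must read in a way that answers that care. When reading English, always keep an English-English dictionary at hand and take the trouble to look up even words you think you already understand.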

　
