Become a MacRumors Supporter for $50/year with no ads, ability to filter front page stories, and private forums.

Super Macho Man

macrumors 6502a
Original poster
Jul 24, 2006
505
0
Hollywood, CA
So if I've got a table of books, with columns "id," "title," and "language." I'm using the 2-letter ISO language codes, like en, fr, de, jp, etc. This works well (although it's not normalized) for books that are written in a single language. Currently, book.language is a varchar(10) which is long enough to hold up to 4 languages separated by spaces. But this is bad design.

So what do I do if a book is written in more than one language? I can create a "language" table with all languages in it, and set up book.language as a foreign key to language.id or whatever. But how can I assign multiple languages to a book, without creating multiple book.language columns, unless that's what I have to do?
 

LtRammstein

macrumors 6502a
Jun 20, 2006
570
0
Denver, CO
Well... There's two ways to go about this. Mind you when I work on a database, I try to find a common table to use as a key, that way I can easily search for desired data.

I would personally make a Language table, and have the book's ID as a key to that. From there you can specifiy how many columns you want. As a cell for those columns, I would make it a Yes/No data input.

I know in MS Access you can have a memo field, so you could just put the 2-letter ISO language code in there seperated by spaces or commas.

I'd rather have a lot of tables with each table taking up very little space than a few tables taking up a lot of space. Plus, I find that search in many tables with very little space to be faster.

Steve
 

toddburch

macrumors 6502a
Dec 4, 2006
748
0
Katy, Texas
Three tables

You could have your book table (with ID as primary key), with no language info.

Then, you could have a language table, with the ISO code column as a primary key, with no book info.

Finally, you could have a book/language table, that ties the book to an available language, with a book ID and Language code both as foreign keys. Insert as many rows as the book is available in languages.

The database manager will use the language table to make sure you are inserting a valid langauge code in the book/language table, just as it will use the book table to make sure you are inserting a valid book in the book/language table.

Ain't referential integrity great?! You just took tons of logic and condition checking out of your program and dumped it on the database manager.

Todd
 

iSee

macrumors 68040
Oct 25, 2004
3,540
272
I'd set it up like this:
Code:
Book Table
----------
bookid
title

BookLanguage Table
-------------------
bookid
langid

The key in BookLanguage is the combination of bookid and langid.

You have multiple entries in BookLanguage for a particular book to indicate more than one language (that is, multiple tuples with the same bookid but different langids).

And you'd have one entry in BookLanguage for a book with one language.

(Also, you could extend this with a Language table with langid as the key if you wanted to track additional fields relating to an entire language.)

Edit: ah, I see toddburch beat me to it.
 

Super Macho Man

macrumors 6502a
Original poster
Jul 24, 2006
505
0
Hollywood, CA
OK, I'm reading more about this and it sounds like the join table is the way to go for a many-to-many relationship like this. So I've got the book/language join table set up (book_lang_join) with the foreign keys, but it's empty. When I insert a new book for example, do I update book.lang or book_lang_join.bookid? Will the values in the foreign key columns of book_lang_join automatically propagate to the book and language tables?

What about the data I already have in the book.language and language.name columns? Is there a way I can populate book_lang_join automatically with an insert statement? A way to somewhat automate this...

(This is with PostgreSQL by the way :) )
 

savar

macrumors 68000
Jun 6, 2003
1,950
0
District of Columbia
Finally, you could have a book/language table, that ties the book to an available language, with a book ID and Language code both as foreign keys. Insert as many rows as the book is available in languages.

Bingo...this is called a junction or join table. It's not terribly space efficient, but its the best design solution for speed. (Assuming you create the correct indices on each of the three tables.)

Super_Macho_Man said:
OK, I'm reading more about this and it sounds like the join table is the way to go for a many-to-many relationship like this. So I've got the book/language join table set up (book_lang_join) with the foreign keys, but it's empty. When I insert a new book for example, do I update book.lang or book_lang_join.bookid? Will the values in the foreign key columns of book_lang_join automatically propagate to the book and language tables?

What about the data I already have in the book.language and language.name columns? Is there a way I can populate book_lang_join automatically with an insert statement? A way to somewhat automate this...

You have to update each table separately. book_lang_join should be the only table with RI constraints, so you insert into the book table first (presumably, the language table is pre-filled with correct metadata), then insert into join table. If you did it the other way around you'd get an error.

I don't know about Postgre, but in Oracle you can write a stored procedure to do this process in 1 step:
1) Get next in sequence ID for the book table.
2) Insert a row into the book table using unique ID.
3) Insert unique ID & language pairs into join table.
 

iSee

macrumors 68040
Oct 25, 2004
3,540
272
OK, I'm reading more about this and it sounds like the join table is the way to go for a many-to-many relationship like this. So I've got the book/language join table set up (book_lang_join) with the foreign keys, but it's empty. When I insert a new book for example, do I update book.lang or book_lang_join.bookid? Will the values in the foreign key columns of book_lang_join automatically propagate to the book and language tables?

What about the data I already have in the book.language and language.name columns? Is there a way I can populate book_lang_join automatically with an insert statement? A way to somewhat automate this...

(This is with PostgreSQL by the way :) )

It sounds like you still have a language column in book (book.lang in the quote). You shouldn't. Everything relating languages to books is in your book_lan_join table.
 

Super Macho Man

macrumors 6502a
Original poster
Jul 24, 2006
505
0
Hollywood, CA
It sounds like you still have a language column in book (book.lang in the quote). You shouldn't. Everything relating languages to books is in your book_lan_join table.

Yeah... but is there a way I can "move" the data in book.lang to the book_lang_join table?
 

toddburch

macrumors 6502a
Dec 4, 2006
748
0
Katy, Texas
If PostgreSQL allows a select on an insert clause (see blue below), you can populate the book/lang table with a single insert statement. Like this (as a complete example)

Code:
CREATE TABLE BOOK_TABLE                  
(BOOKID   INTEGER NOT NULL,              
AUTHOR    CHAR(20) NOT NULL ,            
PAGES     INTEGER NOT NULL ,             
LANG_CODE CHAR(2) NOT NULL)              
                                          
IN DATABASE TBURCH ;                     
                                         
CREATE TABLE BOOK_LANG                   
                                         
(BOOKID   INTEGER NOT NULL,              
LANG_CODE CHAR(2) NOT NULL)              
IN DATABASE TBURCH ;                     
                                         
INSERT INTO BOOK_TABLE VALUES            
( 001 , 'DR. SEUSS',  40,  'EN') ;       
                                         
INSERT INTO BOOK_TABLE VALUES            
( 002 , 'DR. PHIL' ,  3 ,  'EN') ;       
                                         
INSERT INTO BOOK_TABLE VALUES            
( 003 , 'DR. SPOCK', 20 ,  'FR') ;        
                                         
INSERT INTO BOOK_TABLE VALUES            
( 003 , 'DR. SPOCK', 20 ,  'EN') ;       
                                         
INSERT INTO BOOK_TABLE VALUES            
( 003 , 'DR. SPOCK', 20 ,  'DE') ;        
                                         
COMMIT ;                                 
                                         
[color=blue]INSERT INTO BOOK_LANG (BOOKID, LANG_CODE)
SELECT BOOKID, LANG_CODE FROM BOOK_TABLE ;[/color]
                                          
SELECT *  FROM BOOK_LANG ;
Then, perhaps with PostgreSQL, you can ALTER your table to remove the column you don't want. In DB2, you would have to drop the table and recreate (along with other hoops and loops to carry the data across).
 

ChrisA

macrumors G5
Jan 5, 2006
12,907
2,155
Redondo Beach, California
[classically you would have a "book" table and a "language" table and the third table relating the two. But your language" table wouod have only one column and be usless. So don't bother with it. Now you have just two. One table "book" where you list each book once and one table called book_language that contains just the book_ID and the language code. If a book is in more than one language it applers in book_languages table once per language.

Now it is easy to querry for say, all books in "EN" and the set of languages for a given title.
 

toddburch

macrumors 6502a
Dec 4, 2006
748
0
Katy, Texas
... But your language table would have only one column and be usless...

Well, perhaps not useless. With RI definitions, it could be used by the database manager to validate language codes, ala a domain of valid values.

It's use will be mandated by the needs of the application.

Todd
 

ChrisA

macrumors G5
Jan 5, 2006
12,907
2,155
Redondo Beach, California
Well, perhaps not useless. With RI definitions, it could be used by the database manager to validate language codes.

Correct. validate is useful.

Also, One could add columns to the language table, for example to point to a national flag that could be used as an icon.

When I design something I normally go for the "classic" implementation. So I would have used three tables. The reason for going classic is so I can understand it later or someone after me could.

PostgreSQL has some feaures that would make this problem easy but are non-standard. For example one could use just one table for books and define one column to contain an ARRAY or language codes. Yes I know not standard not "real relational DBMS" but it would work and be a bit faster.

But with a small database, ("small means under a million rows) three tables is fast enough
 
Register on MacRumors! This sidebar will go away, and you'll see fewer ads.