For quite some time now, the blog has been fully moved to my private domain and is available at http://andreas-romeyke.de/blog.
Two colleagues and long-term preservation…
This question is interesting not only from an insurance point of view; it could also help to substantiate the case for digital long-term preservation and to put process decisions on a more objective footing.
How do you determine the value of something that you only know is needed once it is no longer accessible?
There are various approaches to determining value. One is to take the cost of a "replacement". For retro-digitized material, this would be the cost of digitizing it again. If books are missing from the shelf, it would be the effort required to procure them again.
For unknown file formats, this would be the cost of analyzing the file format. In this context you sometimes hear the statement: "If in doubt, we'll put a doctoral student on it who will investigate it for half a year".
In the case of the Siebeck estate, we were dealing with floppy disks for a Panasonic typewriter. Interpreting them requires at least obtaining such a typewriter, which is currently not available on the relevant portals. For reverse engineering the data format, one would certainly have to allow another half year.
Another approach would be to determine how many works did not come into being because the object was lost. In other words, you look at how many articles, research papers, etc. deal with a digital object. If this object did not exist, the works building on it would not exist either.
From the previous considerations it follows that the rarer an object is, the more valuable it is. Here, "rare" expresses a lack of redundancy.
If a book with a print run of 10,000 copies once cost €10 per copy, and only 2 copies exist now, can one then calculate
€10 / 2 * 10,000 = €50,000
?
Hmmm. Do these approaches make sense? Not really. In practice, one would perhaps pick typical representatives of certain object type groups and determine a value for them.
Anyway, if anyone has an idea how to do this better, bring it on!
In the last blog post we considered what the basic structure of the information packages should look like and how we will deal with versioning. In the following I would like to describe further cornerstones of a minimalistic archival information system. These will then form the basis for a first implementation, which would go beyond the scope of this blog. As soon as there is news worth reporting, I will announce it here.
Other archival information systems sometimes make it too easy for themselves and use a database to manage information about the AIPs in the archive. In principle there is nothing wrong with this, but it often seems to be forgotten that a basic principle of information packages is that data and metadata together form one intellectual entity (IE). What does this mean? The idea is that an IE should be able to stand on its own at all times. Following this principle has two consequences. First, hierarchically nested IEs cannot exist unless self-contained IEs are encapsulated like a box of boxes. In other words: IEs that only contain references to other IEs are not possible because they would not be viable on their own.
The second consequence is that all metadata must always be in a consistent state, regardless of the state of the archival information system. In other words, there must be no contradictions between the information stored in the AIP and the information in the system's database.
Why is this important? The Archival Information Packages are ultimately the time capsules that will outlast the archive. If everything breaks, but a copy of an AIP is still found on tape, it must contain all the information needed to interpret the data to be preserved.
So for the management of the AIPs we define the following:
What has proven to be very helpful is the following:
A minimalist archival information system (MAIS) should have the following properties:
I plan to tackle the programming in the coming weeks and months. I will probably not go into detail about the individual steps of programming here. As soon as there is something presentable, I'll let you know. Otherwise let me know what your experiences are, which details are important to you with an AIS, especially if it should be particularly lightweight.
In the last post I explained some basic terms. Now it's time for the real thing.
The first question is: what should the information packages (SIP, AIP, DIP) look like? It is important that they are easy to process, easy to understand and easy to extend. Fortunately, RFC 8493 has the solution ready for us: BagIt.
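To give an impression of how little is needed, here is a minimal sketch that writes and checks a bag with the Python standard library only. The function names `make_bag` and `verify_bag` are made up for this illustration, the payload is assumed to be a flat set of files, and only the mandatory parts of a bag (bag declaration, `data/` directory, one payload manifest) are produced.

```python
import hashlib
import tempfile
from pathlib import Path


def make_bag(bag_dir: Path, payload: dict[str, bytes]) -> None:
    """Write a minimal bag: bagit.txt, data/ and manifest-sha256.txt."""
    data_dir = bag_dir / "data"
    data_dir.mkdir(parents=True)
    # Bag declaration with the two mandatory lines.
    (bag_dir / "bagit.txt").write_text(
        "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )
    manifest_lines = []
    for name, content in sorted(payload.items()):
        (data_dir / name).write_bytes(content)
        digest = hashlib.sha256(content).hexdigest()
        # One manifest line per payload file: checksum, whitespace, path.
        manifest_lines.append(f"{digest}  data/{name}\n")
    (bag_dir / "manifest-sha256.txt").write_text(
        "".join(manifest_lines), encoding="utf-8"
    )


def verify_bag(bag_dir: Path) -> bool:
    """Recompute every checksum listed in the payload manifest."""
    manifest = (bag_dir / "manifest-sha256.txt").read_text(encoding="utf-8")
    for line in manifest.splitlines():
        digest, path = line.split(maxsplit=1)
        actual = hashlib.sha256((bag_dir / path).read_bytes()).hexdigest()
        if actual != digest:
            return False
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        bag = Path(tmp) / "example-bag"
        make_bag(bag, {"vase.txt": b"a vase"})
        print(verify_bag(bag))
```

A real implementation would also have to handle nested payload directories, tag manifests and the path encoding rules of the RFC, which this sketch leaves out.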
In the last post I already mentioned that we have to think about the topic of versioning of AIPs. Not only because of the metadata or AIP updates, but also in the case of a PP&A, i.e. format migration. A simple idea is to introduce linked lists.
This allows us to easily implement the functionality of rolling back an AIP version as well.
A new AIP points to its predecessor: the new version receives a reference entry in its "bag-info.txt":
The last two keys are optional and only needed if an AIP-to-AIP transfer is required to move digital objects from one archival information system to another.
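The linked-list idea can be sketched in a few lines. Note that the key name `Previous-Version-Identifier` is a placeholder invented for this example and not necessarily the key used in the actual design; the archive is modelled as a plain dictionary from AIP identifier to the text of its "bag-info.txt".

```python
# Hypothetical key name, chosen only for this illustration.
PREVIOUS_KEY = "Previous-Version-Identifier"


def parse_bag_info(text: str) -> dict[str, str]:
    """Parse simple 'Label: value' lines (continuation lines are ignored)."""
    tags = {}
    for line in text.splitlines():
        if ":" in line:
            label, value = line.split(":", 1)
            tags[label.strip()] = value.strip()
    return tags


def version_chain(archive: dict[str, str], newest_id: str) -> list[str]:
    """Follow predecessor references from the newest AIP back to the first."""
    chain = []
    current = newest_id
    while current is not None:
        if current in chain:
            raise ValueError(f"cycle in version chain at {current}")
        chain.append(current)
        current = parse_bag_info(archive[current]).get(PREVIOUS_KEY)
    return chain


archive = {
    "aip-1": "External-Identifier: aip-1\n",
    "aip-2": "External-Identifier: aip-2\nPrevious-Version-Identifier: aip-1\n",
    "aip-3": "External-Identifier: aip-3\nPrevious-Version-Identifier: aip-2\n",
}
history = version_chain(archive, "aip-3")
```

Because every version only knows its predecessor, older AIPs never have to be rewritten when a new version is added.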
Many people who deal with the digital preservation of objects for the first time and who work in small memory institutions feel helpless in the face of the vast range of functions and requirements of current archival information systems.
Students of library or archival science often appear to be similarly overwhelmed when they are supposed to learn what constitutes archival software.
This has motivated me to write down some thoughts on a minimalist archival information system. Because it really doesn't need much.
An archive essentially has three roles: the submitter, called the producer, the user, also called the consumer, and the problem solver who maintains the archive, also called the technical analyst.
When digital objects are transferred to the archive, this is called the ingest process. When they are requested from the archive, this is called the access process.
The digital objects to be preserved are provided with all the necessary information for the archive ingest and are packaged in a predefined structure. This is called a Submission Information Package (SIP). You can actually imagine this just like in real life. For example, if you want to store a vase, you put it in a box, label it and put it on a shelf.
The archive checks whether (allegorically) the vase is in the box and intact, and whether there is a stamp and signature confirming that the content of the package is indeed a vase. A file number and a storage location are assigned, and the box goes onto the shelf, sealed and neatly labeled. The "box" is called an Archival Information Package (AIP). With the seal, the archive takes responsibility.
At some point, when the user would like to see the vase from the archive again, the archive would process the request and send the vase and accompanying information to the user. This is then called a Dissemination Information Package (DIP).
In addition to this simple "I store something safely and retrieve it again at some point" approach, an archive fulfills another task that is not so obvious: it ensures that objects entrusted to it are kept usable.
What does that mean in the digital world?
Even if it is possible in principle to store a digital object securely and bit-accurately over a very long period of time (bitstream preservation), it can still age because the environment for using this object is no longer available.
There are essentially three concepts for keeping digital objects usable (content preservation): hardware museum, emulation or format migration.
Hardware museums (e.g. a slot machine museum) try to keep old equipment running in a controlled environment. To do that, they have to build up a stock in good time and acquire the knowledge of how to maintain and repair these devices.
With emulation, I try to recreate the environment for the digital object so that it feels at home and doesn't notice any difference from the previous, real world. A very good example of an emulator is MAME, but there are various others that emulate, e.g., retro computers like the Amiga or C64, so that old programs from their time can run on them. Here, too, I need knowledge about what the environment to be emulated looks like and how I can recreate it with today's means.
With format migration, I try to find, in good time, a new form that retains the essential properties (significant properties), and to transfer the files of a digital object to a newer data format.
From this point onwards, it is assumed that this is the preferred way of maintaining usability.
It follows that an Archival Information System (AIS) must be able to support this process of format migration. The process (also called Preservation Planning and Action) results in a new version of the Archival Information Package being created. The AIS must be able to manage this.
That would basically be all there is to Archival Information Systems if it weren't for the librarians.
Unlike archivists, for whom a record is complete and closed, librarians know the concepts of supplements and metadata submissions. A page that has fallen out has turned up here, a letter has been discovered there in an estate, or it has dawned on some people that there is now money for costly in-depth indexing. Ergo, librarians expect people to think about how to handle metadata and data updates on existing AIPs (called metadata update and AIP update). This is not trivial, since some AIPs are very large and you want to avoid pointless copying. For such an update, we also need a good way for producers to tell the archive which AIP is to be supplemented or updated.
However, AIPs are already versioned in the case of format migration, so the same mechanism can be used here as well. Any change to the AIP creates a new version of the AIP. And so that you can't accidentally break anything, you should always be able to go back to an old version. And since that is also error-prone, the result of the rollback process will simply be a new version.
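The "rollback is just another new version" rule can be illustrated with a small sketch. The class and field names below are assumptions made for this example; an AIP is reduced to an append-only list of versions, each holding a set of file names.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AipVersion:
    number: int
    files: frozenset[str]
    reason: str


@dataclass
class Aip:
    versions: list[AipVersion] = field(default_factory=list)

    def commit(self, files, reason: str) -> AipVersion:
        """Every change (ingest, update, migration) appends a new version."""
        version = AipVersion(len(self.versions) + 1, frozenset(files), reason)
        self.versions.append(version)
        return version

    def rollback(self, to_number: int) -> AipVersion:
        """Rolling back never deletes anything; it re-commits an old state."""
        if not 1 <= to_number <= len(self.versions):
            raise ValueError(f"unknown version {to_number}")
        old = self.versions[to_number - 1]
        return self.commit(old.files, f"rollback to version {to_number}")


aip = Aip()
aip.commit({"page1.tif"}, "ingest")
aip.commit({"page1.tif", "page2.tif"}, "AIP update")
restored = aip.rollback(1)
```

After the rollback the AIP has three versions: the faulty second version is still there, and a mistaken rollback can itself be undone in the same way.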
That's it. It's nothing more. Easy, isn't it?
BagIt (RFC 8493) forms the basis for Submission Information Packages (SIP) and Archival Information Packages (AIP) in many digital archives.
Especially in the library environment, it is necessary to support supplemental submissions in the Archival Information System (AIS) software. Supplements may be limited to metadata or may add new files, remove existing files, or replace existing files.
Unfortunately, the BagIt specification offers no clean and easy way to implement a differential SIP.
A design of a differential BagIt (dBagIt) should meet the following conditions:
1. existing BagIt should not be touched
2. it should be based on the BagIt structure so that the conversion effort is minimal
3. it should be easy to implement
4. it should support the "add" and "delete" operations
5. the checksum protection should be guaranteed
6. the referenced bag should be specified explicitly
The basis is the structure of BagIt. The following are the changes that are mandatory.
In contrast to section 2.1.1 of RFC 8493, the bag declaration file is named dbagit.txt.
In contrast to section 2.1.3 of RFC 8493, each line of a payload manifest file MUST be of the form
sign checksum filepath
where sign is either + for adding a file or - for deleting a file.
The replacement of files is simulated by one entry each for deleting and adding.
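How such a manifest could be processed is sketched below. This is only an illustration under assumptions: the fields are taken to be separated by whitespace, the checksum on a "-" line is taken to be the checksum of the file to be deleted, and the existing AIP is modelled as a dictionary from file path to checksum. The function names are invented for this example.

```python
def parse_dbagit_manifest(text: str) -> list[tuple[str, str, str]]:
    """Split each line into (sign, checksum, filepath)."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue
        sign, checksum, filepath = line.split(maxsplit=2)
        if sign not in ("+", "-"):
            raise ValueError(f"invalid sign in line: {line!r}")
        entries.append((sign, checksum, filepath))
    return entries


def apply_dbagit(existing: dict[str, str], entries) -> dict[str, str]:
    """Return the updated path -> checksum mapping; 'existing' stays untouched."""
    updated = dict(existing)
    # Deletions first, so a replacement works regardless of line order.
    for sign, checksum, filepath in entries:
        if sign == "-":
            if updated.get(filepath) != checksum:
                raise ValueError(f"cannot delete {filepath}: checksum mismatch")
            del updated[filepath]
    for sign, checksum, filepath in entries:
        if sign == "+":
            if filepath in updated:
                raise ValueError(f"cannot add {filepath}: already present")
            updated[filepath] = checksum
    return updated


old_aip = {"data/a.txt": "aaa", "data/b.txt": "bbb"}
manifest = "- aaa data/a.txt\n+ ccc data/a.txt\n+ ddd data/c.txt\n"
new_aip = apply_dbagit(old_aip, parse_dbagit_manifest(manifest))
```

In the example, `data/a.txt` is replaced (one "-" and one "+" line), `data/c.txt` is added and `data/b.txt` is left alone. The short strings stand in for real checksums.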
In addition to RFC 8493, the key Updates-External-Identifier becomes mandatory. It references the original data object that is updated by this dBagIt.
The Tag Manifest is similar to RFC8493.
Although tag manifest files in BagIt could be used to describe additional proprietary subdirectories of a bag that are not specified in the RFC, dBagIt deliberately does not define change operations for them as it does for the payload manifest in the previous section. This simplifies the creation and processing of dBagIts.
The implementation must ensure that:
If there is interest, I would be happy to receive feedback via art1pirat ATgmail.com. Maybe a new RFC can grow out of it.
A very simple solution could also be the use of the unified 'diff' format. This would also allow partial changes within files, but would hardly bring any advantages for binary data and is not quite as intuitive for users who are not familiar with IT.
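For comparison, this is roughly what the unified diff alternative would look like for a textual metadata file, using `difflib` from the Python standard library. The helper name `metadata_diff` and the sample metadata are invented for this illustration.

```python
import difflib


def metadata_diff(old: str, new: str, name: str) -> str:
    """Return a unified diff between two versions of a text file."""
    return "".join(
        difflib.unified_diff(
            old.splitlines(keepends=True),
            new.splitlines(keepends=True),
            fromfile=f"a/{name}",
            tofile=f"b/{name}",
        )
    )


old_info = "Title: Vase\nCreator: unknown\n"
new_info = "Title: Vase\nCreator: A. Potter\n"
patch = metadata_diff(old_info, new_info, "bag-info.txt")
print(patch)
```

The patch only carries the changed line plus some context, which is compact for text but, as said, of little use for binary payload files.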
In the first part I described how I managed to read the floppy disks (using KryoFlux). Now I would like to report on the current state of knowledge about the floppy disk format of the Panasonic typewriter - in the quiet hope that someone could uncover the last secret.
I found the most important clue while researching a successor model - the Panasonic KX-W1000. I stumbled across the following old blog post: https://surrey.lug.org.uk/panasonic-kx-w1000.
Even if it didn't lead to a full success, there were some interesting insights. The floppy image is strongly related to FAT12.
Here is my summary.
The filesystem is based on FAT12 with proprietary extensions.
The first bytes are: 0x00 00 00 4B 58 2D 57 31 35 31 30 20 31 2E 30 30 20, which corresponds to the string "KX-W1510 1.00" starting at the fourth byte (offset 3).
The first 256 bytes are very similar to the boot sector of old DOS floppies:
0000:0000 | 00 00 00 4B 58 2D 57 31 35 31 30 20 31 2E 30 30 | ...KX-W1510 1.00
0000:0010 | 20 F9 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |  ù..............
0000:0020 | 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 | ................
0000:0030 | 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 | ................
0000:0040 | 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 | ................
0000:0050 | 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 | ................
0000:0060 | 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 | ................
0000:0070 | 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 | ................
0000:0080 | 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 | ................
0000:0090 | 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 | ................
0000:00A0 | 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 | ................
0000:00B0 | 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 | ................
0000:00C0 | 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 | ................
0000:00D0 | 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 | ................
0000:00E0 | 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 | ................
0000:00F0 | 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 | ................
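The model string in the header can serve as a quick plausibility check before looking deeper into an image. The following sketch assumes that the image is available as raw bytes and that the string sits at offset 3, as in the dump above; whether other firmware versions or models use a different string is unknown, and the function name is made up for this example.

```python
SIGNATURE = b"KX-W1510 1.00"
SIGNATURE_OFFSET = 3  # the string starts at the fourth byte


def looks_like_kx_w1510_image(image: bytes) -> bool:
    """Check for the model string at the start of a floppy image."""
    end = SIGNATURE_OFFSET + len(SIGNATURE)
    return image[SIGNATURE_OFFSET:end] == SIGNATURE


# First bytes of the image, copied from the hex dump.
header = bytes.fromhex("00 00 00 4B 58 2D 57 31 35 31 30 20 31 2E 30 30 20 F9")
is_typewriter_floppy = looks_like_kx_w1510_image(header)
```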
There are two identical blocks which probably represent FATs, one at address 0x200:
0000:0200 | F9 FF FF 03 40 00 05 B0 00 07 80 00 09 A0 00 FF | ùÿÿ.@..°..... .ÿ
0000:0210 | FF FF 0D E0 00 0F 00 01 FF 8F 01 13 40 01 15 60 | ÿÿ.à....ÿ...@..`
0000:0220 | 01 17 F0 FF 19 90 02 1B 10 02 1D E0 01 1F 00 02 | ..ðÿ.......à....
0000:0230 | FF 2F 02 23 F0 FF 25 60 02 2D 80 02 2C A0 02 2B | ÿ/.#ðÿ%`.-.., .+
0000:0240 | F0 FF FF EF 02 35 00 03 31 20 03 33 F0 FF 36 F0 | ðÿÿï.5..1 .3ðÿ6ð
0000:0250 | FF 37 80 03 39 F0 FF 3B C0 03 3D E0 03 FF 0F 04 | ÿ7..9ðÿ;À.=à.ÿ..
0000:0260 | 41 20 04 43 F0 FF 45 60 04 47 80 04 49 F0 FF 4B | A .CðÿE`.G..IðÿK
0000:0270 | C0 04 4D E0 04 4F F0 FF 51 20 05 53 40 05 55 F0 | À.Mà.OðÿQ .S@.Uð
0000:0280 | FF 57 80 05 59 A0 05 5B F0 FF 5D E0 05 69 B0 07 | ÿW..Y .[ðÿ]à.i°.
0000:0290 | 7F 20 06 63 40 06 65 F0 FF 6E 80 06 6B A0 06 FF | . .c@.eðÿn..k .ÿ
0000:02A0 | CF 06 6D F0 FF 7E 00 07 71 20 07 73 40 07 FF 6F | Ï.mðÿ~..q .s@.ÿo
0000:02B0 | 07 77 80 07 79 A0 07 FF CF 07 7D 60 08 80 30 08 | .w..y .ÿÏ.}`..0.
0000:02C0 | 81 20 08 FF 4F 08 85 80 08 87 F0 FF FF AF 08 8B | . .ÿO.....ðÿÿ¯..
0000:02D0 | F0 08 8D E0 08 90 20 09 91 40 09 93 F0 FF FF 6F | ð..à.. ..@..ðÿÿo
0000:02E0 | 09 A2 F0 09 99 A0 09 9B C0 09 9D F0 FF A1 00 0A | .¢ð.. ..À..ðÿ¡..
0000:02F0 | B2 60 0A A3 40 0A A5 F0 FF AC 80 0A A9 A0 0A AB | ²`.£@.¥ðÿ¬..© .«
0000:0300 | F0 FF AD E0 0A AF 00 0B B1 B0 0B BA 40 0B B5 C0 | ðÿ.à.¯..±°.º@.µÀ
0000:0310 | 0C B7 80 0B B9 60 0C C3 C0 0B BD E0 0B BF 00 0C | .·..¹`.ÃÀ.½à.¿..
0000:0320 | C1 20 0C CA 40 0C C5 80 0C C7 90 0C E8 00 0D CB | Á .Ê@.Å..Ç..è..Ë
0000:0330 | F0 FF CD E0 0C CF B0 0D D1 50 0D D3 40 0D DE F0 | ðÿÍà.Ï°.ÑP.Ó@.Þð
0000:0340 | FF D7 E0 0E D9 70 0E E2 C0 0D DD F0 FF DF 00 0E | ÿ×à.Ùp.âÀ.Ýðÿß..
0000:0350 | E1 50 0E E3 40 0E E6 90 0E FB C0 0E FF AF 0E EB | áP.ã@.æ..ûÀ.ÿ¯.ë
0000:0360 | F0 FF ED 60 0F EF 00 0F F1 20 0F F3 40 0F F5 F0 | ðÿí`.ï..ñ .ó@.õð
0000:0370 | FF F7 80 0F F9 A0 0F FF CF 0F FD E0 0F 08 01 00 | ÿ÷..ù .ÿÏ.ýà....
and one at 0x800:
0000:0800 | F9 FF FF 03 40 00 05 B0 00 07 80 00 09 A0 00 FF | ùÿÿ.@..°..... .ÿ
0000:0810 | FF FF 0D E0 00 0F 00 01 FF 8F 01 13 40 01 15 60 | ÿÿ.à....ÿ...@..`
0000:0820 | 01 17 F0 FF 19 90 02 1B 10 02 1D E0 01 1F 00 02 | ..ðÿ.......à....
0000:0830 | FF 2F 02 23 F0 FF 25 60 02 2D 80 02 2C A0 02 2B | ÿ/.#ðÿ%`.-.., .+
0000:0840 | F0 FF FF EF 02 35 00 03 31 20 03 33 F0 FF 36 F0 | ðÿÿï.5..1 .3ðÿ6ð
0000:0850 | FF 37 80 03 39 F0 FF 3B C0 03 3D E0 03 FF 0F 04 | ÿ7..9ðÿ;À.=à.ÿ..
0000:0860 | 41 20 04 43 F0 FF 45 60 04 47 80 04 49 F0 FF 4B | A .CðÿE`.G..IðÿK
0000:0870 | C0 04 4D E0 04 4F F0 FF 51 20 05 53 40 05 55 F0 | À.Mà.OðÿQ .S@.Uð
0000:0880 | FF 57 80 05 59 A0 05 5B F0 FF 5D E0 05 69 B0 07 | ÿW..Y .[ðÿ]à.i°.
0000:0890 | 7F 20 06 63 40 06 65 F0 FF 6E 80 06 6B A0 06 FF | . .c@.eðÿn..k .ÿ
0000:08A0 | CF 06 6D F0 FF 7E 00 07 71 20 07 73 40 07 FF 6F | Ï.mðÿ~..q .s@.ÿo
0000:08B0 | 07 77 80 07 79 A0 07 FF CF 07 7D 60 08 80 30 08 | .w..y .ÿÏ.}`..0.
0000:08C0 | 81 20 08 FF 4F 08 85 80 08 87 F0 FF FF AF 08 8B | . .ÿO.....ðÿÿ¯..
0000:08D0 | F0 08 8D E0 08 90 20 09 91 40 09 93 F0 FF FF 6F | ð..à.. ..@..ðÿÿo
0000:08E0 | 09 A2 F0 09 99 A0 09 9B C0 09 9D F0 FF A1 00 0A | .¢ð.. ..À..ðÿ¡..
0000:08F0 | B2 60 0A A3 40 0A A5 F0 FF AC 80 0A A9 A0 0A AB | ²`.£@.¥ðÿ¬..© .«
0000:0900 | F0 FF AD E0 0A AF 00 0B B1 B0 0B BA 40 0B B5 C0 | ðÿ.à.¯..±°.º@.µÀ
0000:0910 | 0C B7 80 0B B9 60 0C C3 C0 0B BD E0 0B BF 00 0C | .·..¹`.ÃÀ.½à.¿..
0000:0920 | C1 20 0C CA 40 0C C5 80 0C C7 90 0C E8 00 0D CB | Á .Ê@.Å..Ç..è..Ë
0000:0930 | F0 FF CD E0 0C CF B0 0D D1 50 0D D3 40 0D DE F0 | ðÿÍà.Ï°.ÑP.Ó@.Þð
0000:0940 | FF D7 E0 0E D9 70 0E E2 C0 0D DD F0 FF DF 00 0E | ÿ×à.Ùp.âÀ.Ýðÿß..
0000:0950 | E1 50 0E E3 40 0E E6 90 0E FB C0 0E FF AF 0E EB | áP.ã@.æ..ûÀ.ÿ¯.ë
0000:0960 | F0 FF ED 60 0F EF 00 0F F1 20 0F F3 40 0F F5 F0 | ðÿí`.ï..ñ .ó@.õð
0000:0970 | FF F7 80 0F F9 A0 0F FF CF 0F FD E0 0F 08 01 00 | ÿ÷..ù .ÿÏ.ýà....
0000:0980 | 00 00 00 00 00 00 00 00 00 00 00 00 FF 0F 00 00 | ............ÿ...
The main directory always starts at address 0xE00:
0000:0E00 | 20 20 20 20 20 20 44 49 5B 54 20 FF 00 00 00 00 |       DI[T ÿ....
0000:0E10 | 00 00 00 00 00 00 06 00 21 00 02 00 F5 13 00 00 | ........!...õ...
0000:0E20 | 20 20 20 20 20 20 41 46 46 45 20 FF 00 00 00 00 |       AFFE ÿ....
0000:0E30 | 00 00 00 00 00 00 06 00 21 00 06 00 F6 13 00 00 | ........!...ö...
0000:0E40 | 20 20 20 54 52 5D 46 46 45 4C 20 FF 00 00 00 00 |    TR]FFEL ÿ....
0000:0E50 | 00 00 00 00 00 00 06 00 21 00 0C 00 C4 13 00 00 | ........!...Ä...
0000:0E60 | 20 20 45 52 42 50 52 49 4E 5A 20 FF 00 00 00 00 |   ERBPRINZ ÿ....
0000:0E70 | 00 00 00 00 00 00 06 00 21 00 11 00 17 14 00 00 | ........!.......
0000:0E80 | 20 20 20 20 42 49 53 54 52 4F 20 FF 00 00 00 00 |     BISTRO ÿ....
0000:0E90 | 00 00 00 00 00 00 06 00 21 00 12 00 61 14 00 00 | ........!...a...
0000:0EA0 | 20 20 20 48 55 48 4E 20 49 49 20 FF 00 00 00 00 |    HUHN II ÿ....
0000:0EB0 | 00 00 00 00 00 00 06 00 21 00 1C 00 CC 13 00 00 | ........!...Ì...
0000:0EC0 | 20 20 20 20 57 41 43 48 41 55 20 FF 00 00 00 00 |     WACHAU ÿ....
0000:0ED0 | 00 00 00 00 00 00 06 00 21 00 1A 00 C0 13 00 00 | ........!...À...
0000:0EE0 | 20 20 20 20 20 4B 41 4B 41 4F 20 FF 00 00 00 00 |      KAKAO ÿ....
0000:0EF0 | 00 00 00 00 00 00 06 00 21 00 24 00 1D 14 00 00 | ........!.$.....
0000:0F00 | 20 20 20 20 20 20 4D 5D 4C 4C 20 FF 00 00 00 00 |       M]LL ÿ....
0000:0F10 | 00 00 00 00 00 00 06 00 21 00 27 00 22 0B 00 00 | ........!.'."...
0000:0F20 | 20 46 52 41 55 20 4D 4F 44 45 20 FF 00 00 00 00 |  FRAU MODE ÿ....
0000:0F30 | 00 00 00 00 00 00 06 00 21 00 2F 00 AC 13 00 00 | ........!./.¬...
0000:0F40 | 20 20 20 20 53 55 50 50 45 4E 20 FF 00 00 00 00 |     SUPPEN ÿ....
0000:0F50 | 00 00 00 00 00 00 06 00 21 00 34 00 C7 13 00 00 | ........!.4.Ç...
0000:0F60 | 55 4E 53 45 52 20 42 52 4F 54 20 FF 00 00 00 00 | UNSER BROT ÿ....
0000:0F70 | 00 00 00 00 00 00 06 00 21 00 3A 00 B8 13 00 00 | ........!.:.¸...
0000:0F80 | 20 20 20 20 20 20 31 39 39 34 20 FF 00 00 00 00 |       1994 ÿ....
0000:0F90 | 00 00 00 00 00 00 06 00 21 00 3F 00 AA 13 00 00 | ........!.?.ª...
0000:0FA0 | 20 20 20 20 20 4B 5D 43 48 45 20 FF 00 00 00 00 |      K]CHE ÿ....
0000:0FB0 | 00 00 00 00 00 00 06 00 21 00 44 00 3D 14 00 00 | ........!.D.=...
0000:0FC0 | 20 55 43 4B 45 52 4D 41 52 4B 20 FF 00 00 00 00 |  UCKERMARK ÿ....
0000:0FD0 | 00 00 00 00 00 00 06 00 21 00 4A 00 2B 14 00 00 | ........!.J.+...
0000:0FE0 | 20 20 52 49 45 53 4C 49 4E 47 20 FF 00 00 00 00 |   RIESLING ÿ....
0000:0FF0 | 00 00 00 00 00 00 06 00 21 00 50 00 31 14 00 00 | ........!.P.1...
0000:1000 | 43 48 49 4E 41 54 52 5D 46 46 20 FF 00 00 00 00 | CHINATR]FF ÿ....
0000:1010 | 00 00 00 00 00 00 06 00 21 00 56 00 25 14 00 00 | ........!.V.%...
0000:1020 | 20 4B 5B 53 45 52 45 53 54 45 20 FF 00 00 00 00 |  K[SERESTE ÿ....
0000:1030 | 00 00 00 00 00 00 06 00 21 00 5C 00 E3 12 00 00 | ........!.\.ã...
0000:1040 | 4B 41 54 5A 45 4E 46 55 54 54 20 FF 00 00 00 00 | KATZENFUTT ÿ....
0000:1050 | 00 00 00 00 00 00 06 00 21 00 61 00 CC 12 00 00 | ........!.a.Ì...
0000:1060 | 20 20 52 4F 42 55 43 48 4F 4E 20 FF 00 00 00 00 |   ROBUCHON ÿ....
0000:1070 | 00 00 00 00 00 00 06 00 21 00 5F 00 55 14 00 00 | ........!._.U...
0000:1080 | 20 20 20 4D 41 4E 41 47 45 52 20 FF 00 00 00 00 |    MANAGER ÿ....
0000:1090 | 00 00 00 00 00 00 06 00 21 00 67 00 FC 13 00 00 | ........!.g.ü...
0000:10A0 | 20 20 4D 49 43 48 45 4C 49 4E 20 FF 00 00 00 00 |   MICHELIN ÿ....
0000:10B0 | 00 00 00 00 00 00 06 00 21 00 6F 00 8C 14 00 00 | ........!.o.....
0000:10C0 | 20 20 50 49 4D 45 4E 54 4F 53 20 FF 00 00 00 00 |   PIMENTOS ÿ....
0000:10D0 | 00 00 00 00 00 00 06 00 21 00 75 00 14 14 00 00 | ........!.u.....
0000:10E0 | 54 48 4F 4D 41 53 4D 41 4E 4E 20 FF 00 00 00 00 | THOMASMANN ÿ....
0000:10F0 | 00 00 00 00 00 00 06 00 21 00 66 00 20 14 00 00 | ........!.f. ...
0000:1100 | 20 20 38 2D 4D 41 49 2D 34 35 20 FF 00 00 00 00 |   8-MAI-45 ÿ....
0000:1110 | 00 00 00 00 00 00 06 00 21 00 60 00 2A 14 00 00 | ........!.`.*...
0000:1120 | 20 20 43 4F 51 41 55 56 49 4E 20 FF 00 00 00 00 |   COQAUVIN ÿ....
0000:1130 | 00 00 00 00 00 00 06 00 21 00 89 00 0B 14 00 00 | ........!.......
0000:1140 | 20 47 55 44 45 20 53 54 55 42 20 FF 00 00 00 00 |  GUDE STUB ÿ....
0000:1150 | 00 00 00 00 00 00 06 00 21 00 8C 00 A0 14 00 00 | ........!... ...
0000:1160 | 20 20 4D 4F 4E 54 43 41 55 44 20 FF 00 00 00 00 |   MONTCAUD ÿ....
0000:1170 | 00 00 00 00 00 00 06 00 21 00 95 00 63 15 00 00 | ........!...c...
0000:1180 | 20 53 50 41 52 47 45 4C 45 49 20 FF 00 00 00 00 |  SPARGELEI ÿ....
0000:1190 | 00 00 00 00 00 00 06 00 21 00 98 00 BD 14 00 00 | ........!...½...
0000:11A0 | 53 45 4D 49 42 45 4C 47 49 45 20 FF 00 00 00 00 | SEMIBELGIE ÿ....
0000:11B0 | 00 00 00 00 00 00 06 00 21 00 97 00 B3 25 00 00 | ........!...³%..
0000:11C0 | 20 53 45 4D 49 4E 41 52 39 35 20 FF 00 00 00 00 |  SEMINAR95 ÿ....
0000:11D0 | 00 00 00 00 00 00 06 00 21 00 9E 00 C9 4B 00 00 | ........!...ÉK..
0000:11E0 | 20 20 54 41 4E 54 41 4C 55 53 20 FF 00 00 00 00 |   TANTALUS ÿ....
0000:11F0 | 00 00 00 00 00 00 06 00 21 00 A7 00 DC 12 00 00 | ........!.§.Ü...
In contrast to FAT12, each directory entry uses 10 bytes for the file name, left-padded with spaces. Umlauts in filenames are possible (see below). A filename suffix does not exist. This corresponds with the findings in the typewriter manual.
Sometimes there is a special directory at offset 0x100; this could hold the address lists or dictionaries:
0000:0100 | 20 20 20 20 57 41 53 53 45 52 20 FF 00 00 00 00 |     WASSER ÿ....
0000:0110 | 00 00 00 00 00 00 06 00 21 00 48 00 36 0A 00 00 | ........!.H.6...
0000:0120 | 20 20 20 20 20 4B 5D 43 48 45 20 FF 00 00 00 00 |      K]CHE ÿ....
0000:0130 | 00 00 00 00 00 00 06 00 21 00 49 00 3D 14 00 00 | ........!.I.=...
0000:0140 | 20 20 20 41 55 53 54 45 52 4E 20 FF 00 00 00 00 |    AUSTERN ÿ....
0000:0150 | 00 00 00 00 00 00 06 00 21 00 4E 00 D2 0A 00 00 | ........!.N.Ò...
0000:0160 | 20 20 20 20 54 52 5B 55 4D 45 20 FF 00 00 00 00 |     TR[UME ÿ....
0000:0170 | 00 00 00 00 00 00 06 00 21 00 50 00 59 25 00 00 | ........!.P.Y%..
0000:0180 | 20 20 52 45 43 48 4E 55 4E 47 20 FF 00 00 00 00 |   RECHNUNG ÿ....
0000:0190 | 00 00 00 00 00 00 06 00 21 00 54 00 C3 08 00 00 | ........!.T.Ã...
0000:01A0 | 20 20 20 20 20 48 45 4E 52 59 20 FF 00 00 00 00 |      HENRY ÿ....
0000:01B0 | 00 00 00 00 00 00 06 00 21 00 59 00 66 11 00 00 | ........!.Y.f...
0000:01C0 | 53 43 48 57 41 52 5A 41 44 4C 20 FF 00 00 00 00 | SCHWARZADL ÿ....
0000:01D0 | 00 00 00 00 00 00 06 00 21 00 57 00 94 0A 00 00 | ........!.W.....
0000:01E0 | 20 50 4C 41 43 48 55 54 54 41 20 FF 00 00 00 00 |  PLACHUTTA ÿ....
0000:01F0 | 00 00 00 00 00 00 06 00 21 00 5C 00 49 09 00 00 | ........!.\.I...
But sometimes there are text fragments instead (from another floppy):
0000:0100 | 64 20 73 63 68 E9 64 6C 69 63 68 21 C9 20 20 20 | d schédlich!É
0000:0110 | 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 |
0000:0120 | 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 |
0000:0130 | 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 |
0000:0140 | 55 64 6F 20 50 6F 6C 6C 6D 65 72 2C 20 65 69 6E | Udo Pollmer, ein
0000:0150 | 20 4C 65 62 65 6E 73 6D 69 74 74 65 6C 63 68 65 |  Lebensmittelche
0000:0160 | 6D 69 6B 65 72 20 75 6E 64 20 65 72 66 6F 6C 67 | miker und erfolg
0000:0170 | 72 65 69 63 68 65 72 20 20 20 20 20 20 20 20 20 | reicher
0000:0180 | 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 |
0000:0190 | 46 61 63 68 62 75 63 68 61 75 74 6F 72 20 68 61 | Fachbuchautor ha
0000:01A0 | 74 20 69 6E 20 65 69 6E 65 6D 20 5A 65 69 74 75 | t in einem Zeitu
0000:01B0 | 6E 67 73 69 6E 74 65 72 76 69 65 77 20 65 72 6B | ngsinterview erk
0000:01C0 | 6C E9 72 74 3A 20 20 20 20 20 20 20 20 20 20 20 | lért:
0000:01D0 | 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 |
0000:01E0 | 22 44 69 E9 74 65 6E 20 6D 61 63 68 65 6E 20 64 | "Diéten machen d
0000:01F0 | 69 63 6B 22 2E 20 57 65 69 6C 20 64 65 72 20 4B | ick". Weil der K
Umlauts and special characters are mapped as follows:
ä → 0x7b
ö → 0x7c
ü → 0x7d
Ä → 0x5b
Ö → 0x5c
Ü → 0x5d
ß → 0x85
hyphen → 0xbc
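The mapping listed above can be turned into a small decoder, for example to make the file names in the directory readable. This is a sketch that only knows the eight mappings from the list; every other byte outside printable ASCII is replaced by U+FFFD, because its meaning is not known. The function names are invented for this example.

```python
# Mapping taken from the list above; everything else is treated as ASCII.
PANASONIC_TO_UNICODE = {
    0x7B: "ä", 0x7C: "ö", 0x7D: "ü",
    0x5B: "Ä", 0x5C: "Ö", 0x5D: "Ü",
    0x85: "ß", 0xBC: "-",
}


def decode_text(raw: bytes) -> str:
    """Decode typewriter bytes; unknown non-ASCII bytes become U+FFFD."""
    chars = []
    for byte in raw:
        if byte in PANASONIC_TO_UNICODE:
            chars.append(PANASONIC_TO_UNICODE[byte])
        elif 0x20 <= byte < 0x7F:
            chars.append(chr(byte))
        else:
            chars.append("\ufffd")
    return "".join(chars)


def decode_filename(raw_name: bytes) -> str:
    """Decode a 10-byte, left-padded file name from a directory entry."""
    return decode_text(raw_name[:10]).lstrip(" ")


# File names copied from the directory dump.
names = [
    decode_filename(bytes.fromhex("20 20 20 20 20 20 44 49 5B 54")),
    decode_filename(bytes.fromhex("20 20 20 54 52 5D 46 46 45 4C")),
    decode_filename(bytes.fromhex("20 4B 5B 53 45 52 45 53 54 45")),
]
```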
What is still completely unclear is how the FATs are constructed. They do look like FAT12 entries: the first bytes 0xf9 0xff 0xff 0x03... and the frequently occurring 0xff suggest this. Yet there seems to be no connection between the addresses of the text fragments in the image and the FAT byte sequences.
In the directory entries, everything points to byte 26 indicating the start cluster and bytes 28-29 the file size. However, I could not yet decipher the connection with the FAT and the actual offset (or cluster) of the data.
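For anyone who wants to experiment: this is how a block would be unpacked if it were a plain FAT12 table, where two 12-bit entries share three bytes. It is only a sketch of the standard FAT12 packing; whether the typewriter really uses this layout is exactly the open question, and the function names are made up for this example.

```python
def unpack_fat12(raw: bytes) -> list[int]:
    """Unpack 12-bit FAT entries; every 3 bytes hold 2 entries."""
    entries = []
    for i in range(0, len(raw) - 2, 3):
        b0, b1, b2 = raw[i], raw[i + 1], raw[i + 2]
        entries.append(b0 | ((b1 & 0x0F) << 8))  # low 12 bits
        entries.append((b1 >> 4) | (b2 << 4))    # high 12 bits
    return entries


def follow_chain(fat: list[int], start: int) -> list[int]:
    """Follow a cluster chain until an end marker, a gap or a loop."""
    chain = []
    cluster = start
    while 2 <= cluster < 0xFF8 and cluster < len(fat) and cluster not in chain:
        chain.append(cluster)
        cluster = fat[cluster]
    return chain


# First nine bytes of the block at 0x200.
first_entries = unpack_fat12(bytes.fromhex("F9 FF FF 03 40 00 05 B0 00"))
```

Under this reading the first two entries are the usual reserved values (media descriptor 0xF9 and an end marker), followed by entries that point to the next cluster number.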
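The current reading of a directory entry can be written down as a small parser. It assumes 32-byte entries (as the dump suggests), takes byte 26 as the start cluster and reads bytes 28-29 as a little-endian 16-bit size. The byte order is an assumption borrowed from FAT, and the names `DirEntry` and `parse_dir_entry` are invented for this example.

```python
import struct
from typing import NamedTuple


class DirEntry(NamedTuple):
    raw_name: bytes      # 10 bytes, left-padded with spaces
    start_cluster: int   # byte 26 (hypothesis)
    size: int            # bytes 28-29, little-endian (hypothesis)


def parse_dir_entry(entry: bytes) -> DirEntry:
    """Split one 32-byte directory entry into its presumed fields."""
    if len(entry) != 32:
        raise ValueError("a directory entry is expected to be 32 bytes long")
    raw_name = entry[:10]
    start_cluster = entry[26]
    (size,) = struct.unpack_from("<H", entry, 28)
    return DirEntry(raw_name, start_cluster, size)


# First entry of the main directory at 0xE00.
first = parse_dir_entry(bytes.fromhex(
    "20 20 20 20 20 20 44 49 5B 54 20 FF 00 00 00 00"
    "00 00 00 00 00 00 06 00 21 00 02 00 F5 13 00 00"
))
```

For the first entry this yields start cluster 2 and a size of 0x13F5 bytes, which at least looks plausible for a text document.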
The meaning of offset 0x100 is unclear.
If you have any ideas how to read the FATs, or how to interpret the bytes 26, 28-29 of the directory entries, or what the cluster size should be, feel free to write me.
If you are the owner of such an old typewriter, it would be helpful to have a clean-room floppy copy, i.e. a freshly formatted floppy with a small test text, so that I can reverse engineer the data format even better.
Just contact me at art1piratatgoogledotcom
https://archive.org/details/MSXTechnicalDataBook/page/n269/mode/2up
https://manualsbrain.com/ja/products/panasonic-kx-w1510/
my thanks goes to