Saturday, August 20, 2011

Analyzing Meiryo Font.

A font carries a lot of information in itself.
For vertical writing in particular, a.k.a. "縦書き (tategaki)", we may need to know the details of glyphs and of the font's architecture, such as its typographic data.
I think this knowledge is helpful for eBooks and similar things.

So today I am posting about how I analyzed the font "Meiryo" with a WPF application (C#).

OpenType specification
http://www.microsoft.com/typography/otspec/default.htm


Actually, it's very easy to find a glyph index and its outline data by using the following methods.

CharacterToGlyphMap.TryGetValue (System.Windows.Media.GlyphTypeface)
GetGlyphOutline (System.Windows.Media.GlyphTypeface)
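
As a rough illustration of those two calls, here is a minimal sketch. It assumes a WPF project (PresentationCore referenced) and that the font family is installed; the class and method names (GlyphLookupSketch, GetOutline) are made up for this example and are not from my application.

```csharp
using System;
using System.Windows.Media;

public static class GlyphLookupSketch
{
    // Returns the outline of the (horizontal) glyph for a character,
    // or null if the font or the character cannot be resolved.
    public static Geometry GetOutline(string familyName, char ch, double emSize)
    {
        // Typeface.TryGetGlyphTypeface resolves the physical font behind the family name.
        if (!new Typeface(familyName).TryGetGlyphTypeface(out GlyphTypeface face))
            return null;

        // CharacterToGlyphMap maps a Unicode code point (int) to a glyph index (ushort).
        if (!face.CharacterToGlyphMap.TryGetValue(ch, out ushort glyphIndex))
            return null;

        Console.WriteLine("U+{0:X4} -> glyph index 0x{1:X4}", (int)ch, glyphIndex);

        // Arguments: glyph index, rendering em size, hinting em size.
        return face.GetGlyphOutline(glyphIndex, emSize, emSize);
    }
}
```

Calling GlyphLookupSketch.GetOutline("Meiryo", 'a', 48) should give the horizontal outline only; nothing here selects the vertical form.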


However, there seems to be no method for finding the glyph index used in vertical writing (as far as I know).
So I tried scanning the binary data of the font file with the following steps.

Example: how to find the vertical writing glyph index for the character 'a' (Unicode: U+0061)

1. Find the glyph index for 'a' in cmap.

1-1.cmap Header
     - filter the Encoding Records by EncodingID = 1 or EncodingID = 10 with PlatformID = 3

1-2.CmapFormat4 or CmapFormat12

The glyph index for 'a' is 0x0044 in Meiryo.
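
Steps 1-1 and 1-2 can be sketched roughly as below. This is only a sketch under some assumptions: the whole font file is already in a byte array, the offset of the cmap table is already known from the table directory, and only format 4 is handled (format 12 is left out). The names CmapSketch, FindUnicodeSubtable and LookupFormat4 are invented for this example.

```csharp
using System;

public static class CmapSketch
{
    // Font data is big-endian, so values are assembled byte by byte.
    static int U16(byte[] d, int o) => (d[o] << 8) | d[o + 1];
    static int U32(byte[] d, int o) => (U16(d, o) << 16) | U16(d, o + 2); // assumes offsets below 2 GB

    // Step 1-1: scan the Encoding Records for PlatformID 3 with EncodingID 1 or 10.
    // Returns the absolute offset of the first matching subtable, or -1.
    public static int FindUnicodeSubtable(byte[] d, int cmap)
    {
        int numTables = U16(d, cmap + 2);
        for (int i = 0; i < numTables; i++)
        {
            int rec = cmap + 4 + 8 * i;           // each record is 8 bytes
            int platformId = U16(d, rec), encodingId = U16(d, rec + 2);
            if (platformId == 3 && (encodingId == 1 || encodingId == 10))
                return cmap + U32(d, rec + 4);    // offset is relative to the cmap table
        }
        return -1;
    }

    // Step 1-2 (format 4 only): map a code point to a glyph index, 0 if not found.
    public static int LookupFormat4(byte[] d, int sub, int c)
    {
        if (U16(d, sub) != 4 || c > 0xFFFF) return 0;
        int segX2 = U16(d, sub + 6);              // segCountX2
        int endCodes = sub + 14;
        int startCodes = endCodes + segX2 + 2;    // skip reservedPad
        int idDeltas = startCodes + segX2;
        int idRangeOffsets = idDeltas + segX2;
        for (int i = 0; i < segX2; i += 2)        // i is a byte offset into each array
        {
            if (U16(d, endCodes + i) < c) continue;
            int start = U16(d, startCodes + i);
            if (start > c) return 0;
            int delta = U16(d, idDeltas + i);
            int rangeOffset = U16(d, idRangeOffsets + i);
            if (rangeOffset == 0) return (c + delta) & 0xFFFF;
            // rangeOffset is counted from the position of idRangeOffset[i] itself.
            int g = U16(d, idRangeOffsets + i + rangeOffset + 2 * (c - start));
            return g == 0 ? 0 : (g + delta) & 0xFFFF;
        }
        return 0;
    }
}
```

If the matching record points to a format 12 subtable, LookupFormat4 simply returns 0, so a real reader would need a second branch for that format.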


2. Find the vertical glyph index for 'a' in GSUB.

2-1.GSUB Header

2-2.FeatureList
     - filter by Feature Tag 'vert' or 'vrt2'

2-3.Feature

2-4.LookupList

2-5.Lookup
     - filter by the lookupListIndex values listed in the Feature
     - filter by LookupType = 0x0001

2-6.SingleSubstitutionFormat1 or SingleSubstitutionFormat2

2-7.CoverageFormat1 or CoverageFormat2

The vertical glyph index for 'a' is 0x2793 in Meiryo.
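
The last two steps (2-6 and 2-7) could look roughly like the sketch below. It assumes that steps 2-1 to 2-5 have already produced the absolute offset of a Single Substitution subtable inside the font byte array. GsubSketch, CoverageIndex and Substitute are names invented for this example.

```csharp
using System;

public static class GsubSketch
{
    static int U16(byte[] d, int o) => (d[o] << 8) | d[o + 1];

    // Step 2-7: return the coverage index of a glyph, or -1 if it is not covered.
    public static int CoverageIndex(byte[] d, int cov, int glyph)
    {
        int format = U16(d, cov), count = U16(d, cov + 2);
        for (int i = 0; i < count; i++)
        {
            if (format == 1)
            {
                // Format 1: a plain list of glyph IDs; the index is the position in the list.
                if (U16(d, cov + 4 + 2 * i) == glyph) return i;
            }
            else if (format == 2)
            {
                // Format 2: range records of (startGlyphID, endGlyphID, startCoverageIndex).
                int rec = cov + 4 + 6 * i;
                int start = U16(d, rec), end = U16(d, rec + 2);
                if (start <= glyph && glyph <= end)
                    return U16(d, rec + 4) + (glyph - start);
            }
        }
        return -1;
    }

    // Step 2-6: apply a Single Substitution subtable; uncovered glyphs are returned unchanged.
    public static int Substitute(byte[] d, int sub, int glyph)
    {
        int format = U16(d, sub);
        // The coverage offset is relative to the start of the subtable.
        int index = CoverageIndex(d, sub + U16(d, sub + 2), glyph);
        if (index < 0) return glyph;
        if (format == 1) return (glyph + U16(d, sub + 4)) & 0xFFFF; // deltaGlyphID, modulo 65536
        if (format == 2) return U16(d, sub + 6 + 2 * index);        // substitute array entry
        return glyph;
    }
}
```

The linear scan over the coverage table is chosen for readability; the glyph IDs and ranges are sorted, so a binary search would also work.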








* The application gets the glyph data (the points of the outline) from the glyf table (Simple Glyph Description or Composite Glyph Description).
* The application then draws lines and Bézier curves by using those points.
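
To illustrate how the points turn into lines and curves: glyf outlines are quadratic, and two consecutive off-curve points imply an on-curve point halfway between them. The sketch below assumes the points of one contour have already been decoded into coordinates plus an on-curve flag, and it emits simple path commands ("M", "L", "Q") instead of drawing. ContourSketch, Pt and ToPath are names invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class ContourSketch
{
    public struct Pt
    {
        public double X, Y;
        public bool On;   // true = on-curve point, false = quadratic control point
        public Pt(double x, double y, bool on) { X = x; Y = y; On = on; }
    }

    static Pt Mid(Pt a, Pt b) => new Pt((a.X + b.X) / 2, (a.Y + b.Y) / 2, true);

    static string P(Pt p) => string.Format(CultureInfo.InvariantCulture, "{0} {1}", p.X, p.Y);

    // Converts one closed contour into path commands:
    // "M x y" (move), "L x y" (line), "Q cx cy x y" (quadratic Bezier).
    public static List<string> ToPath(IList<Pt> pts)
    {
        // 1. Insert the implied on-curve midpoint between consecutive off-curve points.
        var full = new List<Pt>();
        for (int i = 0; i < pts.Count; i++)
        {
            Pt a = pts[i], b = pts[(i + 1) % pts.Count];
            full.Add(a);
            if (!a.On && !b.On) full.Add(Mid(a, b));
        }

        // 2. Start at an on-curve point and walk once around the contour.
        var cmds = new List<string>();
        int s = full.FindIndex(p => p.On);
        if (s < 0) return cmds;                       // empty contour
        cmds.Add("M " + P(full[s]));
        for (int k = 1; k <= full.Count; k++)
        {
            Pt p = full[(s + k) % full.Count];
            if (p.On)
            {
                cmds.Add("L " + P(p));
            }
            else
            {
                // After step 1 an off-curve point is always followed by an on-curve point.
                Pt end = full[(s + k + 1) % full.Count];
                cmds.Add("Q " + P(p) + " " + P(end));
                k++;                                  // the end point has been consumed
            }
        }
        return cmds;
    }
}
```

In a WPF application the "L" and "Q" commands would correspond to LineSegment and QuadraticBezierSegment.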



[Figure: Composite Glyph]

[Figure: Simple Glyph]

[Figure: Simple Glyph (multibyte character)]

About endianness

Endianness is important when a program scans binary data. OpenType font files store multi-byte values in big-endian order, so on a little-endian computer the bytes have to be reversed before they are converted.

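public byte[] GetBytes(byte[] bytes, int startIndex, int length)
{
    if(bytes != null)
    {
        // The range [startIndex, startIndex + length) must lie inside the array.
        if(startIndex >= 0 && length > 0 && startIndex + length <= bytes.Length)
        {
            byte[] bytes0 = new byte[length];
            for(int i=0;i<length;i++)
                bytes0[i] = bytes[startIndex + i];
            // Font data is big-endian; reverse it on little-endian machines
            // so that BitConverter reads the value correctly.
            if(BitConverter.IsLittleEndian)
                Array.Reverse(bytes0);
            return bytes0;
        }
    }
    throw new Exception("Exception occurred in GetBytes");
}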


Then type conversions like this become available.


public short GetShort(byte[] bytes, int startIndex, int length)
{
	try
	{
		// A short is always 2 bytes, so the length argument is only used in the error message.
		byte[] sBytes = GetBytes(bytes, startIndex, 2);
		return BitConverter.ToInt16(sBytes, 0);
	}
	catch(Exception ex)
	{
		// Keep the original exception as the inner exception.
		throw new Exception(string.Format("Exception occurred in GetShort(bytes, {0}, {1}): {2}", startIndex, length, ex.Message), ex);
	}
}
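
As an alternative to copying and reversing, a big-endian value can also be assembled directly with shifts, which gives the same result on any machine. This is only a sketch of that idea, not what my application does; the class name BigEndianSketch and its methods are invented for this example.

```csharp
using System;

public static class BigEndianSketch
{
    // Reads an unsigned 16-bit big-endian value: the first byte is the high byte.
    public static ushort ReadUInt16(byte[] bytes, int offset)
    {
        if (bytes == null) throw new ArgumentNullException(nameof(bytes));
        if (offset < 0 || offset > bytes.Length - 2)
            throw new ArgumentOutOfRangeException(nameof(offset));
        return (ushort)((bytes[offset] << 8) | bytes[offset + 1]);
    }

    // Reads a signed 16-bit big-endian value by reinterpreting the same bits.
    public static short ReadInt16(byte[] bytes, int offset)
    {
        return unchecked((short)ReadUInt16(bytes, offset));
    }

    // Reads an unsigned 32-bit big-endian value from two 16-bit halves.
    public static uint ReadUInt32(byte[] bytes, int offset)
    {
        return ((uint)ReadUInt16(bytes, offset) << 16) | ReadUInt16(bytes, offset + 2);
    }
}
```

Because no temporary array is allocated and BitConverter.IsLittleEndian is never consulted, this form is a little cheaper when a table is scanned value by value.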


Tables other than cmap and GSUB hold more information about the font, such as the baseline, but reading them is not easy. I mean, I can't be bothered...
