extras/makecasefoldhashtable.pl
author Ryan C. Gordon <icculus@icculus.org>
Fri, 11 Aug 2017 01:39:22 -0400
changeset 1555 5495a0e50b5e
parent 1373 527ef3c6a2d6
permissions -rwxr-xr-x
utf8: big improvements to case-insensitive UTF-8 string compare.
- Dramatically reduce RAM usage: uses between 8 and 11 kilobytes less static memory for its internal case-folding tables.
- Actually works now. It would fail unconditionally if a codepoint folded into multiple codepoints, even if the compared string contained those exact codepoints.
- Now a public API!
- Removed __PHYSFS_utf8strnicmp(): nothing was using it, it was incorrect anyhow, and what does 'n' represent when either string might case-fold to something larger in-flight, anyhow?
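The point in the commit message about codepoints that fold into multiple codepoints is easier to see with a concrete case. The sketch below is not the PhysicsFS implementation: `fold_codepoint`, `FoldCursor`, `next_folded`, and `fold_compare` are hypothetical names, and the fold table is cut down to ASCII plus U+00DF, which folds to the two codepoints U+0073 U+0073 under full case folding. It shows one way a compare can stay correct when a fold expands, by having each side yield its folded codepoints one at a time.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: fold one codepoint into one to three codepoints.
   Only ASCII and U+00DF are handled; a real table covers far more. */
static int fold_codepoint(uint32_t cp, uint32_t out[3])
{
    if (cp >= 'A' && cp <= 'Z') { out[0] = cp + 32; return 1; }
    if (cp == 0x00DF) { out[0] = 0x0073; out[1] = 0x0073; return 2; }
    out[0] = cp;  /* no fold known: the codepoint maps to itself */
    return 1;
}

/* Cursor that yields a string's folded codepoints one at a time. */
typedef struct FoldCursor
{
    const uint32_t *str;  /* zero-terminated array of codepoints */
    uint32_t pending[3];  /* folded output not yet consumed */
    int count;            /* number of valid entries in pending */
    int pos;              /* next entry of pending to hand out */
} FoldCursor;

static uint32_t next_folded(FoldCursor *c)
{
    if (c->pos == c->count) {  /* pending is drained, fold the next one */
        if (*c->str == 0)
            return 0;          /* end of string */
        c->count = fold_codepoint(*c->str++, c->pending);
        c->pos = 0;
    }
    return c->pending[c->pos++];
}

/* Returns 0 if the strings match after folding, else -1 or 1. */
static int fold_compare(const uint32_t *a, const uint32_t *b)
{
    FoldCursor ca = { a, {0, 0, 0}, 0, 0 };
    FoldCursor cb = { b, {0, 0, 0}, 0, 0 };
    for (;;) {
        const uint32_t x = next_folded(&ca);
        const uint32_t y = next_folded(&cb);
        if (x != y)
            return (x < y) ? -1 : 1;
        if (x == 0)
            return 0;
    }
}
```

Because the two cursors advance independently, a string containing U+00DF on one side and the two codepoints of "ss" on the other line up after folding, even though the inputs differ in length.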
#!/usr/bin/perl -w

use warnings;
use strict;

# Bucket counts for the four hash tables. Each must stay a power of two,
# because the hash is masked with (count - 1) when a bucket is chosen.
my $HASHBUCKETS1_16 = 256;
my $HASHBUCKETS1_32 = 16;
my $HASHBUCKETS2_16 = 16;
my $HASHBUCKETS3_16 = 4;

print <<__EOF__;
/*
 * This file is part of PhysicsFS (https://icculus.org/physfs/)
 *
 * This data generated by physfs/extras/makecasefoldhashtable.pl ...
 * Do not manually edit this file!
 *
 * Please see the file LICENSE.txt in the source's root directory.
 */

#ifndef _INCLUDE_PHYSFS_CASEFOLDING_H_
#define _INCLUDE_PHYSFS_CASEFOLDING_H_

#ifndef __PHYSICSFS_INTERNAL__
#error Do not include this header from your applications.
#endif

/* We build three simple hashmaps here, which map Unicode codepoints to
one, two, or three lowercase codepoints respectively. To retrieve this info:
look at case_fold_hashX, where X is 1, 2, or 3. Most foldable codepoints fold
to one, a few dozen fold to two, and a handful fold to three. If the codepoint
isn't in any of these hashes, it doesn't fold (no separate upper and
lowercase).

Almost all these codepoints fit into 16 bits, so we hash them as such to save
memory. Codepoints > 0xFFFF get a separate hash of their own, since there are
(currently) only about 120 of them and (currently) all of them map to a
single lowercase codepoint. */

typedef struct CaseFoldMapping1_32
{
    PHYSFS_uint32 from;
    PHYSFS_uint32 to0;
} CaseFoldMapping1_32;

typedef struct CaseFoldMapping1_16
{
    PHYSFS_uint16 from;
    PHYSFS_uint16 to0;
} CaseFoldMapping1_16;

typedef struct CaseFoldMapping2_16
{
    PHYSFS_uint16 from;
    PHYSFS_uint16 to0;
    PHYSFS_uint16 to1;
} CaseFoldMapping2_16;

typedef struct CaseFoldMapping3_16
{
    PHYSFS_uint16 from;
    PHYSFS_uint16 to0;
    PHYSFS_uint16 to1;
    PHYSFS_uint16 to2;
} CaseFoldMapping3_16;

typedef struct CaseFoldHashBucket1_16
{
    const CaseFoldMapping1_16 *list;
    const PHYSFS_uint8 count;
} CaseFoldHashBucket1_16;

typedef struct CaseFoldHashBucket1_32
{
    const CaseFoldMapping1_32 *list;
    const PHYSFS_uint8 count;
} CaseFoldHashBucket1_32;

typedef struct CaseFoldHashBucket2_16
{
    const CaseFoldMapping2_16 *list;
    const PHYSFS_uint8 count;
} CaseFoldHashBucket2_16;

typedef struct CaseFoldHashBucket3_16
{
    const CaseFoldMapping3_16 *list;
    const PHYSFS_uint8 count;
} CaseFoldHashBucket3_16;

__EOF__


# Each array element accumulates the C initializer text for one hash bucket.
my @foldPairs1_16;
my @foldPairs2_16;
my @foldPairs3_16;
my @foldPairs1_32;

for (my $i = 0; $i < $HASHBUCKETS1_16; $i++) {
    $foldPairs1_16[$i] = '';
}

for (my $i = 0; $i < $HASHBUCKETS1_32; $i++) {
    $foldPairs1_32[$i] = '';
}

for (my $i = 0; $i < $HASHBUCKETS2_16; $i++) {
    $foldPairs2_16[$i] = '';
}

for (my $i = 0; $i < $HASHBUCKETS3_16; $i++) {
    $foldPairs3_16[$i] = '';
}

open(FH,'<','casefolding.txt') or die("failed to open casefolding.txt: $!\n");
while (<FH>) {
    chomp;
    # strip comments from textfile...
    s/\#.*\Z//;

    # strip whitespace...
    s/\A\s+//;
    s/\s+\Z//;

    # skip anything that isn't a "<code>; <status>; <mapping>;" record.
    next if not /\A([a-fA-F0-9]+)\;\s*(.)\;\s*(.+)\;/;
    my ($code, $status, $mapping) = ($1, $2, $3);

    my $hexxed = hex($code);
    #print("// code '$code'   status '$status'   mapping '$mapping'\n");

    # only common (C) and full (F) foldings are used; simple (S) and
    # Turkic (T) entries are skipped.
    if (($status eq 'C') or ($status eq 'F')) {
        my ($map1, $map2, $map3) = (undef, undef, undef);
        $map1 = $1 if $mapping =~ s/\A([a-fA-F0-9]+)(\s*|\Z)//;
        $map2 = $1 if $mapping =~ s/\A([a-fA-F0-9]+)(\s*|\Z)//;
        $map3 = $1 if $mapping =~ s/\A([a-fA-F0-9]+)(\s*|\Z)//;
        die("mapping space too small for '$code'\n") if ($mapping ne '');
        die("problem parsing mapping for '$code'\n") if (not defined($map1));

        if ($hexxed < 128) {
            # Just ignore these, we'll handle the low-ASCII ones ourselves.
        } elsif ($hexxed > 0xFFFF) {
            # We just need to add the 32-bit 2 and/or 3 codepoint maps if this die()'s here.
            die("Uhoh, a codepoint > 0xFFFF that folds to multiple codepoints! Fixme.") if defined($map2);
            my $hashed = (($hexxed ^ ($hexxed >> 8)) & ($HASHBUCKETS1_32-1));
            #print("// hexxed '$hexxed'  hashed1 '$hashed'\n");
            $foldPairs1_32[$hashed] .= "    { 0x$code, 0x$map1 },\n";
        } elsif (not defined($map2)) {
            my $hashed = (($hexxed ^ ($hexxed >> 8)) & ($HASHBUCKETS1_16-1));
            #print("// hexxed '$hexxed'  hashed1 '$hashed'\n");
            $foldPairs1_16[$hashed] .= "    { 0x$code, 0x$map1 },\n";
        } elsif (not defined($map3)) {
            my $hashed = (($hexxed ^ ($hexxed >> 8)) & ($HASHBUCKETS2_16-1));
            #print("// hexxed '$hexxed'  hashed2 '$hashed'\n");
            $foldPairs2_16[$hashed] .= "    { 0x$code, 0x$map1, 0x$map2 },\n";
        } else {
            my $hashed = (($hexxed ^ ($hexxed >> 8)) & ($HASHBUCKETS3_16-1));
            #print("// hexxed '$hexxed'  hashed3 '$hashed'\n");
            $foldPairs3_16[$hashed] .= "    { 0x$code, 0x$map1, 0x$map2, 0x$map3 },\n";
        }
    }
}
close(FH);
ee871d51510d Bunch of work on Unicode...added case-folding stricmp, removed
Ryan C. Gordon <icculus@icculus.org>
parents:
diff changeset
   162
1555
5495a0e50b5e utf8: big improvements to case-insensitive UTF-8 string compare.
Ryan C. Gordon <icculus@icculus.org>
parents: 1373
diff changeset
   163
for (my $i = 0; $i < $HASHBUCKETS1_16; $i++) {
5495a0e50b5e utf8: big improvements to case-insensitive UTF-8 string compare.
Ryan C. Gordon <icculus@icculus.org>
parents: 1373
diff changeset
   164
    $foldPairs1_16[$i] =~ s/,\n\Z//;
5495a0e50b5e utf8: big improvements to case-insensitive UTF-8 string compare.
Ryan C. Gordon <icculus@icculus.org>
parents: 1373
diff changeset
   165
    my $str = $foldPairs1_16[$i];
5495a0e50b5e utf8: big improvements to case-insensitive UTF-8 string compare.
Ryan C. Gordon <icculus@icculus.org>
parents: 1373
diff changeset
   166
    next if $str eq '';
5495a0e50b5e utf8: big improvements to case-insensitive UTF-8 string compare.
Ryan C. Gordon <icculus@icculus.org>
parents: 1373
diff changeset
   167
    my $num = '000' . $i;
5495a0e50b5e utf8: big improvements to case-insensitive UTF-8 string compare.
Ryan C. Gordon <icculus@icculus.org>
parents: 1373
diff changeset
   168
    $num =~ s/\A.*?(\d\d\d)\Z/$1/;
5495a0e50b5e utf8: big improvements to case-insensitive UTF-8 string compare.
Ryan C. Gordon <icculus@icculus.org>
parents: 1373
diff changeset
   169
    my $sym = "case_fold1_16_${num}";
5495a0e50b5e utf8: big improvements to case-insensitive UTF-8 string compare.
Ryan C. Gordon <icculus@icculus.org>
parents: 1373
diff changeset
   170
    print("static const CaseFoldMapping1_16 ${sym}[] = {\n$str\n};\n\n");
5495a0e50b5e utf8: big improvements to case-insensitive UTF-8 string compare.
Ryan C. Gordon <icculus@icculus.org>
parents: 1373
diff changeset
   171
}
5495a0e50b5e utf8: big improvements to case-insensitive UTF-8 string compare.
Ryan C. Gordon <icculus@icculus.org>
parents: 1373
diff changeset
   172
5495a0e50b5e utf8: big improvements to case-insensitive UTF-8 string compare.
Ryan C. Gordon <icculus@icculus.org>
parents: 1373
diff changeset
   173
for (my $i = 0; $i < $HASHBUCKETS1_32; $i++) {
    $foldPairs1_32[$i] =~ s/,\n\Z//;
    my $str = $foldPairs1_32[$i];
    next if $str eq '';
    my $num = '000' . $i;
    $num =~ s/\A.*?(\d\d\d)\Z/$1/;
    my $sym = "case_fold1_32_${num}";
    print("static const CaseFoldMapping1_32 ${sym}[] = {\n$str\n};\n\n");
}

for (my $i = 0; $i < $HASHBUCKETS2_16; $i++) {
    $foldPairs2_16[$i] =~ s/,\n\Z//;
    my $str = $foldPairs2_16[$i];
    next if $str eq '';
    my $num = '000' . $i;
    $num =~ s/\A.*?(\d\d\d)\Z/$1/;
    my $sym = "case_fold2_16_${num}";
    print("static const CaseFoldMapping2_16 ${sym}[] = {\n$str\n};\n\n");
}

for (my $i = 0; $i < $HASHBUCKETS3_16; $i++) {
    $foldPairs3_16[$i] =~ s/,\n\Z//;
    my $str = $foldPairs3_16[$i];
    next if $str eq '';
    my $num = '000' . $i;
    $num =~ s/\A.*?(\d\d\d)\Z/$1/;
    my $sym = "case_fold3_16_${num}";
    print("static const CaseFoldMapping3_16 ${sym}[] = {\n$str\n};\n\n");
}

print("static const CaseFoldHashBucket1_16 case_fold_hash1_16[] = {\n");

for (my $i = 0; $i < $HASHBUCKETS1_16; $i++) {
    my $str = $foldPairs1_16[$i];
    if ($str eq '') {
        print("    { NULL, 0 },\n");
    } else {
        my $num = '000' . $i;
        $num =~ s/\A.*?(\d\d\d)\Z/$1/;
        my $sym = "case_fold1_16_${num}";
        print("    { $sym, __PHYSFS_ARRAYLEN($sym) },\n");
    }
}
print("};\n\n");


print("static const CaseFoldHashBucket1_32 case_fold_hash1_32[] = {\n");

for (my $i = 0; $i < $HASHBUCKETS1_32; $i++) {
    my $str = $foldPairs1_32[$i];
    if ($str eq '') {
        print("    { NULL, 0 },\n");
    } else {
        my $num = '000' . $i;
        $num =~ s/\A.*?(\d\d\d)\Z/$1/;
        my $sym = "case_fold1_32_${num}";
        print("    { $sym, __PHYSFS_ARRAYLEN($sym) },\n");
    }
}
print("};\n\n");


print("static const CaseFoldHashBucket2_16 case_fold_hash2_16[] = {\n");

for (my $i = 0; $i < $HASHBUCKETS2_16; $i++) {
    my $str = $foldPairs2_16[$i];
    if ($str eq '') {
        print("    { NULL, 0 },\n");
    } else {
        my $num = '000' . $i;
        $num =~ s/\A.*?(\d\d\d)\Z/$1/;
        my $sym = "case_fold2_16_${num}";
        print("    { $sym, __PHYSFS_ARRAYLEN($sym) },\n");
    }
}
print("};\n\n");

print("static const CaseFoldHashBucket3_16 case_fold_hash3_16[] = {\n");

for (my $i = 0; $i < $HASHBUCKETS3_16; $i++) {
    my $str = $foldPairs3_16[$i];
    if ($str eq '') {
        print("    { NULL, 0 },\n");
    } else {
        my $num = '000' . $i;
        $num =~ s/\A.*?(\d\d\d)\Z/$1/;
        my $sym = "case_fold3_16_${num}";
        print("    { $sym, __PHYSFS_ARRAYLEN($sym) },\n");
    }
}
print("};\n\n");

print <<__EOF__;

#endif  /* _INCLUDE_PHYSFS_CASEFOLDING_H_ */

/* end of physfs_casefolding.h ... */

__EOF__

exit 0;

# end of makecasefoldhashtable.pl ...