<?php
// Project: Web Reference Database (refbase) <http://www.refbase.net>
// Copyright: Matthias Steffens <mailto:refbase@extracts.de> and the file's
// original author(s).
//
// This code is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY. Please see the GNU General Public
// License for more details.
//
// File: ./includes/transtab_latin1_charset.inc.php
// Repository: $HeadURL$
// Author(s): Matthias Steffens <mailto:refbase@extracts.de>
//
// Created: 24-Jul-08, 17:45
// Modified: $Date: 2008-08-19 20:05:53 +0000 (Tue, 19 Aug 2008) $
// $Author$
// $Revision: 1206 $
// Search & replace patterns and variables for matching (and conversion of) ISO-8859-1 character case & classes.
// Search & replace patterns must be specified as perl-style regular expressions, and search patterns must include
// the leading & trailing slashes.
// NOTE: Quote from <http://www.onphp5.com/article/22> ("i18n with PHP5: Pitfalls"):
// "PCRE and other regular expression extensions are not locale-aware. This most notably influences the \w class
// that is unable to work for Cyrillic letters. There could be a workaround for this if some preprocessor for the
// regex string could replace \w and friends with character range prior to calling PCRE functions."
// The 'start_session()' function in file 'include.inc.php' should establish an appropriate locale via function
// 'setSystemLocale()' so that e.g. '[[:upper:]]' would also match '�' etc. However, since locale support depends
// on the individual server & system, we keep the workaround which literally specifies higher ASCII chars of the
// latin1 character set below. (For this to work, the character encoding of 'search.php' must be set to
// 'Western (Iso Latin 1)' aka 'ISO-8859-1'!)
// higher ASCII chars upper case = "���������������������������"
// higher ASCII chars lower case = "�����������������������������"
// The variables '$alnum', '$alpha', '$cntrl', '$dash', '$digit', '$graph', '$lower', '$print', '$punct', '$space',
// '$upper', '$word' must be used within a perl-style regex character class.
// Matches ISO-8859-1 letters & digits:
$alnum = "[:alnum:]��������������������������������������������������������";
// Matches ISO-8859-1 letters:
$alpha = "[:alpha:]��������������������������������������������������������";
// Matches ISO-8859-1 control characters:
$cntrl = "[:cntrl:]";
// Matches ISO-8859-1 dashes & hyphens:
$dash = "-�";
// Matches ISO-8859-1 digits:
$digit = "[:digit:]";
// Matches ISO-8859-1 printing characters (excluding space):
$graph = "[:graph:]��������������������������������������������������������";
// Matches ISO-8859-1 lower case letters:
$lower = "[:lower:]�����������������������������";
// Matches ISO-8859-1 printing characters (including space):
$print = "[:print:]��������������������������������������������������������";
// Matches ISO-8859-1 punctuation:
$punct = "[:punct:]";
// Matches ISO-8859-1 whitespace (separating characters with no visual representation):
$space = "[:space:]";
// Matches ISO-8859-1 upper case letters:
$upper = "[:upper:]���������������������������";
// Matches ISO-8859-1 "word" characters:
$word = "_[:alnum:]��������������������������������������������������������";
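
// Sketch of the "preprocessor" idea from the quote in the file header (an illustration only, not part of the
// original refbase code): replace '\w', '\d' and '\s' in a pattern string with bracket expressions built from
// '$word', '$digit' and '$space' before the pattern is handed to a PCRE function. The function name
// 'expandLatin1Escapes()' is hypothetical. The sketch assumes that this file is included at global scope and
// that the escapes only occur OUTSIDE of '[...]' character classes (inside a class the expansion would nest
// brackets and break the pattern).
if (!function_exists('expandLatin1Escapes'))
{
	function expandLatin1Escapes($pattern)
	{
		global $word, $digit, $space;

		// 'strtr()' scans from left to right and never re-scans replaced text, so mapping an escaped
		// backslash to itself prevents '\\w' (a literal backslash followed by 'w') from being expanded:
		return strtr($pattern, array(
			"\\\\" => "\\\\",
			"\\w"  => "[" . $word . "]",
			"\\d"  => "[" . $digit . "]",
			"\\s"  => "[" . $space . "]"
		));
	}
}
// Possible usage (sample pattern is hypothetical): preg_match(expandLatin1Escapes('/^\w+$/'), $sourceString)
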
// Defines the PCRE pattern modifier(s) to be used in conjunction with the above variables:
// More info: <http://www.php.net/manual/en/reference.pcre.pattern.modifiers.php>
$patternModifiers = "";
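
// Usage sketch (an illustration only, not part of the original refbase code): the class variables above hold
// just the CONTENTS of a character class, so each one must be wrapped in '[...]', placed between the pattern
// delimiters, and '$patternModifiers' appended after the closing delimiter. The function name
// 'isCapitalizedWord()' is hypothetical; the sketch assumes that this file is included at global scope and
// that the given text is ISO-8859-1 encoded.
if (!function_exists('isCapitalizedWord'))
{
	function isCapitalizedWord($text)
	{
		global $upper, $lower, $patternModifiers;

		// Builds e.g. "/^[[:upper:]...][[:lower:]...]+$/": one upper case letter followed by one or more
		// lower case letters
		$pattern = "/^[" . $upper . "][" . $lower . "]+$/" . $patternModifiers;

		// 'preg_match()' returns 1 on a match, 0 on no match and FALSE on error
		return (preg_match($pattern, $text) === 1);
	}
}
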
// Converts ISO-8859-1 upper case letters to their corresponding lower case letter:
// TODO!
$transtab_upper_lower = array(
);
// Converts ISO-8859-1 lower case letters to their corresponding upper case letter:
// TODO!
$transtab_lower_upper = array(
);
?>