2005-05-28 00:45:31 +02:00
/*
 *
1.- Style.
2.- Cleanup.
3.- Fix what I assume may cause erratic behavior. Only an inept could create an enumeration in Lex.h like this
	enum TokenType {
		END_OF_STREAM,
		PUNCT,
		NAME,
to be used in the data member tokenType but, at the same time, create preprocessor macros like this
	#define WHITE 1
	#define PUNCT 2
to be stored in and retrieved from
	char charTableArray [256]
to calculate the character class (punctuation, spaces, etc.) in the Lexer,
where the macro PUNCT (value 2) overrides the enum member PUNCT (value 1) and that inconsistent value is used in both tasks, causing PUNCT to be interpreted as tokenType being NAME (value 2 in the enum). Since this module has several bugs, maybe all the bugs cancel each other out and everything works as expected, but that would be pure luck.
2008-04-29 13:05:11 +02:00
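The name clash described in the commit message above can be reproduced in isolation. The following is a minimal sketch, not the original Lex.h: the enumeration is reduced to the three members quoted in the message, and the names enumPunct, macroPunct and punctLooksLikeName are hypothetical helpers introduced only for illustration. It assumes the macros are defined after the enumeration in the same translation unit (defining them first would turn the enumerator list into a syntax error).

```cpp
#include <cassert>

// Hypothetical reduction of the enumeration quoted in the commit message;
// only the first three members are shown there.
enum TokenType
{
	END_OF_STREAM,	// 0
	PUNCT,			// 1
	NAME			// 2
};

// Record the enumerator's value while the name still refers to the enum.
const int enumPunct = PUNCT;

// Character-class macros defined later in the same translation unit.
#define WHITE 1
#define PUNCT 2

// From here on the preprocessor rewrites every PUNCT to the literal 2.
const int macroPunct = PUNCT;

// True when the value produced by the macro is indistinguishable from NAME.
bool punctLooksLikeName()
{
	return macroPunct == NAME;
}
```

The code in this file avoids the clash by giving the character classes their own names (TYPE_WHITE, TYPE_PUNCT, TYPE_DIGIT) and by using TT_-prefixed token types such as TT_PUNCT and TT_NAME.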
 * The contents of this file are subject to the Initial
 * Developer's Public License Version 1.0 (the "License");
 * you may not use this file except in compliance with the
 * License. You may obtain a copy of the License at
 * http://www.ibphoenix.com/idpl.html.
 *
 * Software distributed under the License is distributed on
 * an "AS IS" basis, WITHOUT WARRANTY OF ANY KIND, either
 * express or implied. See the License for the specific
 * language governing rights and limitations under the License.
 *
 * The contents of this file or any work derived from this file
 * may not be distributed under any other license whatsoever
 * without the express prior written permission of the original
 * author.
 *
 *
 * The Original Code was created by James A. Starkey for IBPhoenix.
 *
 * Copyright (c) 1997 - 2000, 2001, 2003 James A. Starkey
 * Copyright (c) 1997 - 2000, 2001, 2003 Netfrastructure, Inc.
 * All Rights Reserved.
 */

// Lex.cpp: implementation of the Lex class.
//
//////////////////////////////////////////////////////////////////////

#include <string.h>
#include <stdio.h>
#include <memory.h>
#include "firebird.h"
#include "Lex.h"
#include "AdminException.h"
#include "InputStream.h"

#define WHITE_SPACE " \t\n\r"
//#define PUNCTUATION_CHARS "<>="
//#define MULTI_CHARS "" //"+=*/%!~<>~^|&="
//#define MAX_TOKEN 1024
// Arguments are parenthesized so the macro stays correct for any expression.
#define UPCASE(c) (((c) >= 'a' && (c) <= 'z') ? (c) - 'a' + 'A' : (c))

const int TYPE_WHITE = 1;
const int TYPE_PUNCT = 2;
//const int TYPE_MULTI_CHAR = 4;
const int TYPE_DIGIT = 8;
//#define TERM (TYPE_WHITE | TYPE_PUNCT)


//////////////////////////////////////////////////////////////////////
// Construction/Destruction
//////////////////////////////////////////////////////////////////////

Lex::Lex(const char* punctuation, const LEX_flags debugFlags)
{
	lineComment = NULL;
	commentStart = NULL;
	memset (charTableArray, 0, sizeof (charTableArray));
	setCharacters (TYPE_PUNCT, punctuation);
	setCharacters (TYPE_WHITE, WHITE_SPACE);
	setCharacters (TYPE_DIGIT, "0123456789");
	ptr = end = NULL;
	inputStream = NULL;
	tokenType = TT_NONE;
	lineNumber = 0;
	continuationChar = 0;
	captureStart = captureEnd = 0;
	flags = debugFlags;
}

Lex::~Lex()
{
	if (inputStream)
		inputStream->release();
}

void Lex::skipWhite()
{
	for (;;)
	{
		while (ptr >= end)
		{
			if (!getSegment())
				return;
		}
		while (ptr < end)
		{
			if (lineComment && lineComment [0] == *ptr && match (lineComment, ptr))
			{
				while (ptr < end && *ptr++ != '\n')
					;
				++inputStream->lineNumber;
			}
			else if (commentStart && commentStart [0] == *ptr && match (commentStart, ptr))
			{
				ptr += strlen (commentStart);
				while (ptr < end)
				{
					if (commentEnd [0] == *ptr && match (commentEnd, ptr))
					{
						ptr += strlen (commentEnd);
						break;
					}

					if (*ptr++ == '\n')
						++inputStream->lineNumber;
				}
			}
			else if (*ptr == continuationChar && ptr [1] == '\n')
			{
				ptr += 2;
				++inputStream->lineNumber;
			}
			else if (charTable(*ptr) & TYPE_WHITE)
			{
				if (*ptr++ == '\n')
				{
					eol = true;
					++inputStream->lineNumber;
				}
			}
			else
				return;
		}
	}
}

// Just another custom memcmp-like routine.
bool Lex::match(const char *pattern, const char *string)
{
	while (*pattern && *string)
	{
		if (*pattern++ != *string++)
			return false;
	}

	return *pattern == 0;
}

void Lex::getToken()
{
	priorInputStream = tokenInputStream;
	priorLineNumber = tokenLineNumber;

	if (tokenType == END_OF_STREAM)
		throw AdminException ("expected token, got end-of-file");

	eol = false;
	skipWhite();

	if ((tokenInputStream = inputStream))
		tokenLineNumber = inputStream->lineNumber;

	if (ptr >= end)
	{
		tokenType = END_OF_STREAM;
		strcpy (token, "-end-of-file-");
		return;
	}

	tokenOffset = inputStream->getOffset (ptr);
	char *p = token;
	const char* const endToken = token + sizeof(token) - 1; // take into account the '\0'
	const char c = *p++ = *ptr++;

	if (charTable(c) & TYPE_PUNCT)
		tokenType = TT_PUNCT;
	else if (c == '\'' || c == '"')
	{
		p = token;
		for (;;)
		{
			if (ptr >= end)
			{
				if (!getSegment())
					throw AdminException ("end of file in quoted string");
			}
			else if (*ptr == c)
				break;
			else
			{
				if (p >= endToken)
					throw AdminException ("token overflow in quoted string");
				*p++ = *ptr++;
			}
		}
		++ptr;
		tokenType = (c == '"') ? QUOTED_STRING : SINGLE_QUOTED_STRING;
	}
	else if (charTable(c) & TYPE_DIGIT)
	{
		tokenType = TT_NUMBER;
		while (ptr < end && (charTable(*ptr) & TYPE_DIGIT))
		{
			if (p >= endToken)
				throw AdminException ("token overflow in number");
			*p++ = *ptr++;
		}
	}
	else
	{
		tokenType = TT_NAME;
		if (flags & LEX_upcase)
		{
			p [-1] = UPCASE(c);
			while (ptr < end && !(charTable(*ptr) & (TYPE_WHITE | TYPE_PUNCT)))
			{
				if (p >= endToken)
					throw AdminException ("token overflow in name (uppercase)");
				const char c2 = *ptr++;
				*p++ = UPCASE(c2);
			}
		}
		else
		{
			while (ptr < end && !(charTable(*ptr) & (TYPE_WHITE | TYPE_PUNCT)))
			{
				if (p >= endToken)
					throw AdminException ("token overflow in name");
				*p++ = *ptr++;
			}
		}
	}

	*p = 0;
}

void Lex::setCharacters(int type, const char *characters)
{
	for (const char *p = characters; *p; ++p)
		charTable(*p) |= type;
}

/***
void Lex::openFile(const char *fileName)
{
	inputStream = new InputFile (fileName, inputStream);
}
***/

void Lex::setLineComment(const char *string)
{
	lineComment = string;
}

void Lex::setCommentString(const char *start, const char *cend)
{
	commentStart = start;
	commentEnd = cend;
}


bool Lex::isKeyword(const char *word) const
{
	return strcmp (token, word) == 0;
}

bool Lex::match(const char *word)
{
	if (!isKeyword (word))
		return false;

	if (*word == captureStart)
		captureStuff();

	getToken();

	return true;
}
|
|
|
|
|
2008-04-24 17:49:43 +02:00
|
|
|
Firebird::PathName Lex::reparseFilename()
|
2005-05-28 00:45:31 +02:00
|
|
|
{
|
|
|
|
char *p = token;
|
|
|
|
|
|
|
|
while (*p)
|
|
|
|
++p;
|
|
|
|
|
2008-04-29 13:05:11 +02:00
|
|
|
while (ptr < end && *ptr != '>' && !(charTable(*ptr) & TYPE_WHITE))
|
2005-05-28 00:45:31 +02:00
|
|
|
*p++ = *ptr++;
|
|
|
|
|
|
|
|
*p = 0;
|
2008-04-29 13:05:11 +02:00
|
|
|
const Firebird::PathName string = token;
|
2005-05-28 00:45:31 +02:00
|
|
|
//getToken();
|
|
|
|
|
|
|
|
return string;
|
|
|
|
}
|
|
|
|
|
2008-04-24 17:49:43 +02:00
|
|
|
Firebird::string Lex::getName()
|
2005-05-28 00:45:31 +02:00
|
|
|
{
|
2008-04-29 13:05:11 +02:00
|
|
|
if (tokenType != TT_NAME)
|
2005-05-28 00:45:31 +02:00
|
|
|
syntaxError ("name");
|
|
|
|
|
2008-04-29 13:05:11 +02:00
|
|
|
const Firebird::string name = token;
|
2005-05-28 00:45:31 +02:00
|
|
|
getToken();
|
2008-04-29 13:05:11 +02:00
|
|
|
|
|
|
|
return name;
|
2005-05-28 00:45:31 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
void Lex::syntaxError(const char *expected)
|
|
|
|
{
|
|
|
|
AdminException exception ("expected %s, got \"%s\"", expected, token);
|
|
|
|
|
|
|
|
if (tokenInputStream)
|
|
|
|
exception.setLocation (tokenInputStream->getFileName(), tokenLineNumber);
|
|
|
|
|
|
|
|
throw exception;
|
|
|
|
}
|
|
|
|
|
|
|
|
/***
|
|
|
|
InputFile* Lex::pushFile(const char *fileName)
|
|
|
|
{
|
|
|
|
if (inputStream)
|
|
|
|
inputStream->ptr = ptr;
|
|
|
|
|
|
|
|
InputFile *inputFile = new InputFile (fileName, inputStream);
|
|
|
|
inputStream = inputFile;
|
|
|
|
ptr = end = NULL;
|
2008-04-29 13:05:11 +02:00
|
|
|
tokenType = TT_NONE;
|
2005-05-28 00:45:31 +02:00
|
|
|
|
|
|
|
return inputFile;
|
|
|
|
}
|
|
|
|
***/
|
|
|
|
|
|
|
|
void Lex::setContinuationChar(char c)
|
|
|
|
{
|
|
|
|
continuationChar = c;
|
|
|
|
}
|
|
|
|
|
|
|
|
void Lex::pushStream(InputStream *stream)
|
|
|
|
{
|
|
|
|
stream->addRef();
|
2008-04-29 13:05:11 +02:00
|
|
|
|
2005-05-28 00:45:31 +02:00
|
|
|
if (flags & LEX_trace)
|
2008-04-29 13:05:11 +02:00
|
|
|
{
|
2005-05-28 00:45:31 +02:00
|
|
|
const char *fileName = stream->getFileName();
|
|
|
|
if (fileName)
|
|
|
|
printf ("Opening %s\n", fileName);
|
2008-04-29 13:05:11 +02:00
|
|
|
}
|
2005-05-28 00:45:31 +02:00
|
|
|
|
|
|
|
if (inputStream)
|
|
|
|
inputStream->ptr = ptr;
|
|
|
|
|
|
|
|
stream->prior = inputStream;
|
|
|
|
inputStream = stream;
|
|
|
|
ptr = end = NULL;
|
2008-04-29 13:05:11 +02:00
|
|
|
tokenType = TT_NONE;
|
2005-05-28 00:45:31 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
bool Lex::getSegment()
|
|
|
|
{
|
|
|
|
if (!inputStream)
|
2008-04-29 13:05:11 +02:00
|
|
|
{
|
2005-05-28 00:45:31 +02:00
|
|
|
tokenType = END_OF_STREAM;
|
|
|
|
eol = true;
|
|
|
|
return false;
|
2008-04-29 13:05:11 +02:00
|
|
|
}
|
2005-05-28 00:45:31 +02:00
|
|
|
|
|
|
|
if (!(ptr = inputStream->getSegment()))
|
2008-04-29 13:05:11 +02:00
|
|
|
{
|
2005-05-28 00:45:31 +02:00
|
|
|
end = ptr;
|
|
|
|
InputStream *prior = inputStream->prior;
|
|
|
|
inputStream->close();
|
|
|
|
inputStream->release();
|
|
|
|
if (!(inputStream = prior))
|
|
|
|
return false;
|
|
|
|
ptr = inputStream->ptr;
|
2008-04-29 13:05:11 +02:00
|
|
|
}
|
2005-05-28 00:45:31 +02:00
|
|
|
|
2009-02-28 12:57:40 +01:00
|
|
|
end = ptr ? inputStream->getEnd() : NULL;
|
2005-05-28 00:45:31 +02:00
|
|
|
|
|
|
|
if (end && (flags & LEX_list))
|
|
|
|
printf (" %s", ptr);
|
2008-04-29 13:05:11 +02:00
|
|
|
|
2005-05-28 00:45:31 +02:00
|
|
|
return true;
|
|
|
|
}
|
|
|
|
|
|
|
|
void Lex::captureStuff()
|
|
|
|
{
|
|
|
|
stuff.clear();
|
|
|
|
|
|
|
|
for (;;)
|
2008-04-29 13:05:11 +02:00
|
|
|
{
|
2005-05-28 00:45:31 +02:00
|
|
|
if (ptr >= end)
|
2008-04-29 13:05:11 +02:00
|
|
|
{
|
2005-05-28 00:45:31 +02:00
|
|
|
if (!getSegment())
|
|
|
|
return;
|
|
|
|
continue;
|
2008-04-29 13:05:11 +02:00
|
|
|
}
|
2005-05-28 00:45:31 +02:00
|
|
|
if (*ptr == captureEnd)
|
|
|
|
return;
|
|
|
|
stuff.putCharacter (*ptr++);
|
2008-04-29 13:05:11 +02:00
|
|
|
}
|
2005-05-28 00:45:31 +02:00
|
|
|
}
|
2006-06-15 03:53:25 +02:00
|
|
|
|
2008-04-29 13:05:11 +02:00
|
|
|
int& Lex::charTable(int ch)
|
2006-06-15 03:53:25 +02:00
|
|
|
{
|
|
|
|
return charTableArray [static_cast<UCHAR>(ch)];
|
|
|
|
}
|