Strings
About this chapter
This chapter describes Mops' string-handling classes. Strings are objects that contain variable-length sequences of text, with methods for deletion, insertion, and so on. Mops' string handling facility provides a good base on which you can build various text-based utilities.
Recommended reading
Inside Macintosh: Text Utilities, Toolbox Utilities
Mops: Using Strings
Source files
String, pString, StrUtilities, zString+, String+
Using strings
Mops strings are implemented as relocatable blocks of heap that can expand and contract as their contents change. A string object itself contains a handle to the heap block that contains the string's data. It also contains three other ivars which we will describe below.
Strings can be useful for a wide variety of programming needs. They can serve as file buffers, staging areas for text to be printed on the screen, dictionaries, or vehicles for parsing user input. You should consider using strings for any run of bytes whose length and/or contents are likely to change in the course of your program's execution. Strings are not restricted to ASCII text, although that will probably be their most common use. Note, however, that text constants can more efficiently be implemented as SCONs or string literals (see II.5 for more information).
Using strings is somewhat like using files, in that you must open the string before you use it and close it when you're through. This is done by sending a New: message to each string before you use it, to allocate the string's heap storage, and then sending a Release: message when you no longer need the string. Release: is actually inherited from String's superclass, Handle, and calls the Toolbox routine DisposeHandle.
There are two classes of strings in Mops. String supports basic string operations, such as Get:, Put:, Insert: and Add:. Class String+, a subclass of String, adds more methods, such as searching. Both classes are in the precompiled Mops.dic. They are split into two classes only because String+ has some code methods, which require the Assembler for compilation, whereas some string operations are needed earlier in the building of the full system, before the Assembler is available. For all practical purposes you can treat the two classes as a single class. This is especially true in PowerPC Mops, where a number of the methods in String+ have been moved to String because they were needed earlier, and String+ has been rewritten in high-level Mops.
Many of the String methods are built around the Toolbox Utilities routine Munger, which is a general-purpose string-processing primitive. You might read the IM Toolbox Utilities section on Munger to gain a deeper understanding of what characteristics it contributes to Mops string handling.
Strings have a current size, which is the same as the length of the relocatable block of heap containing the string's data. Strings also have two offsets into the string data, called POS and LIM. POS marks the 'current' position, and LIM the 'current' end. Most string operations operate on the substring delimited by POS and LIM, which we call the active part of the string, rather than on the whole string. We also keep the size of the string (the real size, that is) in an ivar, so that we can get it quickly without a system call.
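The relationship between the size, POS and LIM can be modelled in a few lines. The sketch below is written in Python rather than Mops, and every name in it (ActiveString, set_pos, set_lim, active) is a hypothetical stand-in rather than part of the Mops API. It only illustrates that operations see the bytes between POS and LIM, while the real size is tracked separately.

```python
class ActiveString:
    """Toy model of a string with an active part; not the Mops implementation."""

    def __init__(self, text: bytes = b""):
        self.data = bytearray(text)   # stands in for the relocatable heap block
        self.size = len(self.data)    # cached real size, kept alongside the data
        self.pos = 0                  # offset of the start of the active part
        self.lim = self.size          # one past the last byte of the active part

    def set_pos(self, n: int) -> None:
        self.pos = n

    def set_lim(self, n: int) -> None:
        self.lim = n

    def active(self) -> bytes:
        """Return the bytes from POS up to, but not including, LIM."""
        return bytes(self.data[self.pos:self.lim])

    def active_len(self) -> int:
        return self.lim - self.pos


s = ActiveString(b"hello, world")
s.set_pos(7)
s.set_lim(12)
print(s.active())   # b'world'
print(s.size)       # 12: the real size is unaffected by POS and LIM
```

Because LIM is one past the last active byte, an empty active part is simply the case where POS equals LIM.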
Communicating with other objects
While most of the method descriptions below should be self-explanatory, several are worth additional comment. One group of String+'s methods takes the address of another String or String+ object as one of its parameters, and accesses the active part of this second string.
String+ also has several methods that simplify its use as a file buffer. ReadN:, ReadRest:, ReadAll: and ReadLine?: all accept a File object as one of the parameters, and will request that the File perform a read into the string, setting the size of the string to the number of bytes actually read. Doing things this way is very convenient, especially as the file data is left in a String+ object, and is therefore subject to all of the various manipulations that String+ can perform.
Finally, String+'s Draw: method accepts a Rect object and a justification parameter, and draws the contents of the string as justified text within the box specified by the rectangle.
Translate tables
Translate tables allow very fast searching of strings for specified sets of characters. In effect we are separating the specification of what we are searching for from the actual search operation itself. This allows an uncluttered and extremely fast search operation (the scan:, <scan:, scax:, and <scax: methods of class String+), and it also allows a very flexible (and easily extensible) choice of what to search for. The setup time for translate tables can generally be factored out of inner loops, or done at compile time, and is quite fast anyway.
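The idea of separating "what to look for" from "the search itself" can be sketched as follows. This is a Python model under stated assumptions, not Mops code: the function names (make_table, select_range, scan) are hypothetical, and the sketch does not attempt to reproduce the exact semantics of scan: or scax:, which are not spelled out here. It shows only the general technique of a 256-entry table that is built once and then consulted with one lookup per byte.

```python
def make_table(chars: bytes) -> bytearray:
    """Build a 256-entry table with a nonzero entry for each selected byte."""
    table = bytearray(256)
    for c in chars:
        table[c] = 1
    return table


def select_range(table: bytearray, lo: int, hi: int) -> None:
    """Select every byte value from lo to hi inclusive."""
    for c in range(lo, hi + 1):
        table[c] = 1


def scan(data: bytes, table: bytearray, pos: int, lim: int) -> int:
    """Return the offset of the first selected byte in data[pos:lim], or lim if none."""
    for i in range(pos, lim):
        if table[data[i]]:        # one table lookup per byte, no per-character tests
            return i
    return lim


# Setup happens once, outside any inner loop.
digits = bytearray(256)
select_range(digits, ord("0"), ord("9"))
separators = make_table(b" ,;")

text = b"order 66 shipped"
first_digit = scan(text, digits, 0, len(text))
first_separator = scan(text, separators, 0, len(text))
print(first_digit)       # 6
print(first_separator)   # 5
```

Changing what is searched for means building a different table; the scanning loop itself stays the same.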
Classes
TrTbl
We first define a class, (TrTbl), which is needed to define the table mapping lower case letters to upper case. This table is then used by some of the methods in the TrTbl class proper. However, this is just an implementation convenience: these classes really should be thought of as one class, so we put all the methods together here.
+--------------------+--------------------------------------------------+
| Superclass         | (TrTbl), whose superclass is Object              |
+====================+==================================================+
| Source file        | StrUtilities zString+                            |
+--------------------+--------------------------------------------------+
| Status             | Core                                             |
+--------------------+--------------------------------------------------+
| Instance variables | Class      Name    Description                   |
|                    | int        count   Used internally in counting   |
|                    |                    characters selected, so the   |
|                    |                    table bytes can be set        |
|                    |                    correctly                     |
|                    | 256 bytes  TheTbl  The table itself              |
+--------------------+--------------------------------------------------+
| Indexed data       | None                                             |
+--------------------+--------------------------------------------------+
| System objects     | Name   Description                               |
|                    | UCtbl  A table which maps lower case letters to  |
|                    |        upper case, and leaves everything else    |
|                    |        unchanged                                 |
+--------------------+--------------------------------------------------+
Inherits: (TrTbl), Object
accessing
tbl:
selection
clear:
put:
selchars:
selchar:
selcharNC:
selRange:
invert:
uc:
operations
transc:
: Methods
Error messages - None
String
String defines a variable-length string object with basic access methods. Its data exists as a relocatable block of heap, and its size is limited only by available memory.
+--------------------+--------------------------------------------------+
| Superclass         | Handle                                           |
+====================+==================================================+
| Source file        | String pString                                   |
+--------------------+--------------------------------------------------+
| Status             | Core                                             |
+--------------------+--------------------------------------------------+
| Instance variables | Class  Name   Description                        |
|                    | Var    pos    Offset into the string of the      |
|                    |               beginning of the active part       |
|                    | Var    lim    One plus the offset of the last    |
|                    |               char in the active part. Note      |
|                    |               that if pos = lim, the active      |
|                    |               part is empty. Some methods        |
|                    |               signal an error if pos > lim, or   |
|                    |               if either is negative or greater   |
|                    |               than the size of the string        |
|                    | Var    size   The size of the heap block         |
|                    |               containing the string data         |
|                    | Int    flags  Various flags are stored here      |
+--------------------+--------------------------------------------------+
| Indexed data       | None                                             |
+--------------------+--------------------------------------------------+
| System objects     | ???                                              |
+--------------------+--------------------------------------------------+
Inherits: Handle, Var, Longword, Object
+----------------------------------------------------------------------+
| accessing |
+======================================================================+
| handle: |
+----------------------------------------------------------------------+
| pos: |
+----------------------------------------------------------------------+
| >pos: |
+----------------------------------------------------------------------+
| lim: |
+----------------------------------------------------------------------+
| >lim: |
+----------------------------------------------------------------------+
| len: |
+----------------------------------------------------------------------+
| >len: |
+----------------------------------------------------------------------+
| skip: |
+----------------------------------------------------------------------+
| more: |
+----------------------------------------------------------------------+
| start: |
+----------------------------------------------------------------------+
| begin: |
+----------------------------------------------------------------------+
| end: |
+----------------------------------------------------------------------+
| nolim: |
+----------------------------------------------------------------------+
| reset: |
+----------------------------------------------------------------------+
| step: |
+----------------------------------------------------------------------+
| <step: |
+----------------------------------------------------------------------+
| manipulation |
+----------------------------------------------------------------------+
| new: |
+----------------------------------------------------------------------+
| ?new: |
+----------------------------------------------------------------------+
| size: |
+----------------------------------------------------------------------+
| setSize: |
+----------------------------------------------------------------------+
| clear: |
+----------------------------------------------------------------------+
| get: |
+----------------------------------------------------------------------+
| all: |
+----------------------------------------------------------------------+
| 1st: |
+----------------------------------------------------------------------+
| ^1st: |
+----------------------------------------------------------------------+
| uc: |
+----------------------------------------------------------------------+
| put: |
+----------------------------------------------------------------------+
| ->: |
+----------------------------------------------------------------------+
| insert: |
+----------------------------------------------------------------------+
| $insert: |
+----------------------------------------------------------------------+
| add: |
+----------------------------------------------------------------------+
| $add: |
+----------------------------------------------------------------------+
| +: |
+----------------------------------------------------------------------+
| fill: |
+----------------------------------------------------------------------+
| search: |
+----------------------------------------------------------------------+
| chsearch: |
+----------------------------------------------------------------------+
| object interaction |
+----------------------------------------------------------------------+
| copyto: |
+----------------------------------------------------------------------+
| mark_original: |
+----------------------------------------------------------------------+
| print: |
+----------------------------------------------------------------------+
| dump: |
+----------------------------------------------------------------------+
| rd: |
+----------------------------------------------------------------------+
| stream interface |
+----------------------------------------------------------------------+
| The stream methods read: and write: are meant to look the same for |
| both strings and files (and for anything else we might think of |
| later). By late binding to an object that supports these, we don't |
| have to know or care exactly what it is. The object gives us bytes |
| or accepts bytes, and tells us whether it was successful, and |
| that's all we have to worry about. |
| |
| For read:, we only use the active part of the string. We update POS |
| by the number of bytes transferred. If we transfer the number asked |
| for, we return a 'no error' code of zero, otherwise -1. |
| (We don't use true and false so as to behave the same way as |
| files). write: is basically the same as add:. There's no way this |
| can fail unless we run out of memory, so we always return zero. |
+----------------------------------------------------------------------+
| read: |
+----------------------------------------------------------------------+
| write: |
+----------------------------------------------------------------------+
| persistence/serialization |
+----------------------------------------------------------------------+
| send: |
+----------------------------------------------------------------------+
| bring: |
+----------------------------------------------------------------------+
: Methods
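The stream convention described for read: and write: can be illustrated with a small model. The following is a Python sketch, not Mops code; the class names StringStream and FileStream and the helper transfer are hypothetical. It assumes that write: appends at the end of the string and extends LIM to the new end, which is a guess based on the statement that write: is basically the same as add:. The point is that the caller relies only on the shared read/write shape and the zero or -1 status code.

```python
import io


class StringStream:
    """Toy string supporting the stream convention; not the Mops implementation."""

    def __init__(self, text: bytes = b""):
        self.data = bytearray(text)
        self.pos = 0
        self.lim = len(self.data)

    def read(self, n: int):
        """Take up to n bytes from the active part; status is 0 if all n were read, else -1."""
        chunk = bytes(self.data[self.pos:min(self.pos + n, self.lim)])
        self.pos += len(chunk)               # POS advances by the bytes transferred
        return chunk, (0 if len(chunk) == n else -1)

    def write(self, chunk: bytes) -> int:
        """Append the bytes (assumed behaviour) and always report success."""
        self.data += chunk
        self.lim = len(self.data)
        return 0


class FileStream:
    """Wraps a binary file object behind the same two methods."""

    def __init__(self, f):
        self.f = f

    def read(self, n: int):
        chunk = self.f.read(n)
        return chunk, (0 if len(chunk) == n else -1)

    def write(self, chunk: bytes) -> int:
        self.f.write(chunk)
        return 0


def transfer(src, dst, n: int) -> int:
    """Move n bytes from src to dst without knowing what either one is."""
    chunk, status = src.read(n)
    if status != 0:
        return status
    return dst.write(chunk)


src = StringStream(b"abcdef")
sink = FileStream(io.BytesIO())
first = transfer(src, sink, 4)    # 0: four bytes were available
second = transfer(src, sink, 4)   # -1: only two bytes were left in the active part
print(first, second, sink.f.getvalue())
```

Using integer status codes rather than booleans mirrors the stated aim of behaving the same way as files.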
Error messages
"String pointer(s) out of bounds"
Pos was found to be greater than Lim, or either was negative or greater than the size of the string. Pos and Lim are also displayed when this message is given. We check for this error condition whenever we access the actual characters of the string. Operations such as >pos: don't perform the check; this is for speed, and also because when we are doing manipulations on Pos and Lim we don't want to put any restriction on intermediate values.
"Can't do that on a string copy"
You attempted to insert, delete, or change the size of a string object which was flagged as a 'copy'. See above under copyto:.
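The checking policy behind "String pointer(s) out of bounds" (no check when Pos and Lim are adjusted, a check when the characters are actually accessed) can be sketched like this. It is a Python model with hypothetical names (CheckedString, StringPointerError, set_pos, set_lim, get), not the Mops implementation.

```python
class StringPointerError(Exception):
    """Raised when the active part is not a valid range of the string."""


class CheckedString:
    """Toy model of deferred bounds checking on POS and LIM."""

    def __init__(self, text: bytes = b""):
        self.data = bytearray(text)
        self.pos = 0
        self.lim = len(self.data)

    def set_pos(self, n: int) -> None:
        self.pos = n          # deliberately unchecked: intermediate values are allowed

    def set_lim(self, n: int) -> None:
        self.lim = n          # deliberately unchecked

    def _check(self) -> None:
        # Valid only if 0 <= pos <= lim <= size.
        if not (0 <= self.pos <= self.lim <= len(self.data)):
            raise StringPointerError(
                f"String pointer(s) out of bounds: pos={self.pos} lim={self.lim}"
            )

    def get(self) -> bytes:
        """Access the characters of the active part; this is where the check happens."""
        self._check()
        return bytes(self.data[self.pos:self.lim])


s = CheckedString(b"abcdef")
s.set_pos(10)             # temporarily out of range, and no error is raised
s.set_pos(s.pos - 8)      # back in range before any characters are touched
print(s.get())            # b'cdef'
```

Setting Lim below Pos and then calling get would raise the error, and the message includes both values, as the description above says the real message does.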
String+
String+ adds many useful methods to String. Note that in PowerMops, some of the methods listed here are actually defined in class String, since we needed them at that stage for the PowerPC code generator, but this shouldn't affect your source code at all.
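+--------------------+--------------------------------------------------+
| Superclass         | String                                           |
+====================+==================================================+
| Source file        | String+ zString+                                 |
+--------------------+--------------------------------------------------+
| Status             | Core                                             |
+--------------------+--------------------------------------------------+
| Instance variables | None (see String)                                |
+--------------------+--------------------------------------------------+
| Indexed data       | None                                             |
+--------------------+--------------------------------------------------+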
Inherits: String, Handle, Var, Longword, Object
accessing
swapPos:
save:
restore:
character fetching
2nd:
last:
comparisons
compare:
?:
=?:
ch=?:
searching
search:
<search:
sch&skip:
chsearch:
<chsearch:
chsch&skip:
chskip?:
chskip:
scanning
scan:
<scan:
scax:
<scax:
translate:
trans1st:
>uc:
ch>uc:
chinsert:
ovwr:
chovwr:
$ovwr:
repl:
$repl:
sch&repl:
replAll:
delete:
deleteN:
line-oriented methods
line>:
nextline?:
<nextline?:
addline:
$addline:
I/O methods
readN:
readLine?:
readRest:
readAll:
readTop:
$write:
send:
bring:
draw:
printAll:
: Methods
Error messages - None