Friday, December 19, 2008

Text

STRING
* The String type derives directly from Object, which makes it a reference type.
* It implements several interfaces, including IComparable, ICloneable, IConvertible, and IEnumerable.
* C# treats it as a primitive type: the compiler lets you write string literals directly in source code.
* Immutable: once created, a string can never get longer, get shorter, or have any of its characters changed.
* Equals: returns true if both references point to the same object; otherwise it compares the characters and returns true if they all match.
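
The points above can be checked with a small C# sketch. The class and method names (StringBasicsDemo and its helpers) are made up for illustration; only the String members used are real.

```csharp
using System;

public static class StringBasicsDemo
{
    // Two strings built separately are distinct objects on the heap.
    public static bool DifferentObjects()
    {
        string a = "hello";
        string b = new string(new[] { 'h', 'e', 'l', 'l', 'o' });
        return !object.ReferenceEquals(a, b);
    }

    // Equals still returns true because every character matches.
    public static bool EqualCharacters()
    {
        string a = "hello";
        string b = new string(new[] { 'h', 'e', 'l', 'l', 'o' });
        return a.Equals(b);
    }

    // "Modifying" methods return a new string; the original is left unchanged.
    public static string OriginalAndUpper(string original)
    {
        string upper = original.ToUpperInvariant();
        return original + "/" + upper;
    }

    public static void Main()
    {
        Console.WriteLine(DifferentObjects());
        Console.WriteLine(EqualCharacters());
        Console.WriteLine(OriginalAndUpper("hello")); // hello/HELLO
    }
}
```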

STRINGBUILDER
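* Internally, a StringBuilder object has a field that refers to an array of Char structures.
* It performs dynamic operations on strings and characters to build up a String.
* Call ToString to convert the StringBuilder's character array into a String. In the CLR version these notes describe, ToString returns a reference to the StringBuilder's internal string field; newer runtimes may copy the characters instead.
* The string returned is immutable.
* Editing the StringBuilder after ToString has been called does not change the string already returned and does not create a new StringBuilder object; the builder moves on to a fresh internal character array and keeps working on that.
* In effect, it is a mutable string.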
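
A minimal C# sketch of the behaviour described above. StringBuilderDemo and its methods are hypothetical names; the example only relies on the documented Append, Insert, and ToString members, not on any particular internal layout.

```csharp
using System;
using System.Text;

public static class StringBuilderDemo
{
    // Builds a string step by step without creating intermediate strings.
    public static string Build()
    {
        var sb = new StringBuilder();
        sb.Append("Hello");
        sb.Append(", ").Append("world");
        sb.Insert(0, ">> ");
        return sb.ToString();
    }

    // The string taken by ToString is not affected by later edits to the builder.
    public static string SnapshotThenEdit()
    {
        var sb = new StringBuilder("abc");
        string snapshot = sb.ToString();
        sb.Append("def"); // changes the builder only
        return snapshot + "|" + sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Build());            // >> Hello, world
        Console.WriteLine(SnapshotThenEdit()); // abc|abcdef
    }
}
```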

Encoding: conversion between characters and bytes
* In the CLR, all characters are represented as 16-bit Unicode code values, and all strings are composed of 16-bit Unicode code values.
* Transmitting raw 16-bit values isn't very efficient for mostly English text, because half of the bytes written would contain zeros.
* It is better to encode the characters into a compact array of bytes, transmit it, and then decode the bytes back into 16-bit values.
* Encoding happens when you write through System.IO.BinaryWriter or StreamWriter; decoding happens when you read through System.IO.BinaryReader or StreamReader.
* If you don't specify an encoding, these types default to UTF-8.
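
As a rough illustration of the writer/reader pairing, the C# sketch below pushes text through a StreamWriter and reads it back with a StreamReader over an in-memory stream. StreamEncodingDemo and RoundTrip are made-up names, and the MemoryStream stands in for whatever file or network stream would be used in practice.

```csharp
using System;
using System.IO;
using System.Text;

public static class StreamEncodingDemo
{
    // Encodes text to bytes with a StreamWriter, then decodes it with a StreamReader.
    public static string RoundTrip(string text, Encoding encoding)
    {
        var buffer = new MemoryStream();
        using (var writer = new StreamWriter(buffer, encoding))
        {
            writer.Write(text); // characters are encoded into bytes here
        }

        // ToArray still works after the writer has closed the MemoryStream.
        byte[] bytes = buffer.ToArray();

        using (var reader = new StreamReader(new MemoryStream(bytes), encoding))
        {
            return reader.ReadToEnd(); // bytes are decoded back into characters
        }
    }

    public static void Main()
    {
        Console.WriteLine(RoundTrip("plain text", Encoding.UTF8));
        Console.WriteLine(RoundTrip("plain text", Encoding.Unicode));
    }
}
```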

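The most frequently used encodings are UTF-16 and UTF-8.
UTF-16
* Encodes each 16-bit character as 2 bytes.
* No compression.
* Excellent performance, since the characters are already 16-bit values in memory.
* Also known as Unicode encoding (exposed in .NET as Encoding.Unicode).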

UTF-8
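* Encodes some characters as 1 byte, some as 2 bytes, some as 3 bytes, and some (surrogate pairs) as 4 bytes.
* Less efficient than UTF-16 when most of the characters need 3 bytes, that is, when their values are 0x0800 or above.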
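
To make the size trade-off concrete, here is a small C# sketch that counts the bytes each encoding needs for a few sample strings. ByteCountDemo and its helpers are illustrative names; GetByteCount counts only the encoded characters and does not include a byte order mark.

```csharp
using System;
using System.Text;

public static class ByteCountDemo
{
    public static int Utf8Bytes(string s) => Encoding.UTF8.GetByteCount(s);

    // Encoding.Unicode is .NET's name for little-endian UTF-16.
    public static int Utf16Bytes(string s) => Encoding.Unicode.GetByteCount(s);

    public static void Main()
    {
        // ASCII text, U+00E9, U+20AC, and U+1F600 written as a surrogate pair.
        string[] labels = { "hello", "U+00E9", "U+20AC", "U+1F600" };
        string[] samples = { "hello", "\u00E9", "\u20AC", "\uD83D\uDE00" };

        for (int i = 0; i < samples.Length; i++)
        {
            Console.WriteLine(labels[i] + ": UTF-8 = " + Utf8Bytes(samples[i])
                + " bytes, UTF-16 = " + Utf16Bytes(samples[i]) + " bytes");
        }
    }
}
```

For "hello", UTF-8 needs 5 bytes against 10 for UTF-16; for U+20AC, UTF-8 needs 3 bytes against 2, which is the case where UTF-16 comes out ahead.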

UTF-7
* Intended for systems that handle only characters expressed as 7-bit values.
* Best avoided, because it usually ends up expanding the data instead of compressing it.

ASCII
* Encodes 16-bit characters as ASCII characters.
* Any character with a value below 128 (0x80) is converted into a single byte.
* For any greater value, the character's value is lost.
* When every character is in the ASCII range, this compresses the data by half.
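
The loss for non-ASCII characters can be seen in a short C# sketch. AsciiDemo and ThroughAscii are hypothetical names; the sketch assumes the default behaviour of Encoding.ASCII, which substitutes a question mark for characters it cannot represent.

```csharp
using System;
using System.Text;

public static class AsciiDemo
{
    // Encodes to ASCII bytes and decodes again; characters above 0x7F do not survive.
    public static string ThroughAscii(string s)
    {
        byte[] bytes = Encoding.ASCII.GetBytes(s);
        return Encoding.ASCII.GetString(bytes);
    }

    public static void Main()
    {
        Console.WriteLine(Encoding.ASCII.GetByteCount("hello")); // 5, one byte per character
        Console.WriteLine(ThroughAscii("hello"));                // hello
        Console.WriteLine(ThroughAscii("caf\u00E9"));            // caf?
    }
}
```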

To encode or decode, obtain an instance of a class derived from System.Text.Encoding, for example through a static property such as Encoding.UTF8.
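
A minimal C# sketch of that pattern, using GetBytes and GetString on an Encoding instance. EncodingInstanceDemo and its methods are made-up names; Encoding.UTF8 and UnicodeEncoding are two of the real Encoding-derived types that could be used.

```csharp
using System;
using System.Text;

public static class EncodingInstanceDemo
{
    // Encodes the text to bytes and decodes it back with the same Encoding instance.
    public static string RoundTrip(string text, Encoding encoding)
    {
        byte[] bytes = encoding.GetBytes(text);
        return encoding.GetString(bytes);
    }

    public static int EncodedLength(string text, Encoding encoding)
    {
        return encoding.GetBytes(text).Length;
    }

    public static void Main()
    {
        Encoding utf8 = Encoding.UTF8;           // instance from a static property
        Encoding utf16 = new UnicodeEncoding();  // instance created directly

        string text = "Gr\u00FC\u00DFe";         // five characters, two of them non-ASCII
        Console.WriteLine(EncodedLength(text, utf8));  // 7
        Console.WriteLine(EncodedLength(text, utf16)); // 10
        Console.WriteLine(RoundTrip(text, utf8) == text);
    }
}
```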

Comparing Strings
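* Equals: performs an ordinal comparison. It checks for the same reference first, then compares the characters.
* CompareOrdinal: compares the character code values only. Always case-sensitive, and fast.
* Compare: culture-specific by default; strings that a culture considers logically equal can compare as equal even when their code values differ.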
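
The difference between the three can be sketched in C# as follows. CompareDemo and its helpers are illustrative names. The culture-sensitive result for the composed and decomposed forms of the same letter depends on the runtime's globalization settings, so no expected output is given for that line.

```csharp
using System;

public static class CompareDemo
{
    public static bool OrdinalEquals(string a, string b) => string.Equals(a, b);

    // Sign of the ordinal comparison: -1, 0, or 1.
    public static int OrdinalSign(string a, string b) =>
        Math.Sign(string.CompareOrdinal(a, b));

    // Culture-sensitive comparison using the invariant culture.
    public static int CultureSign(string a, string b) =>
        Math.Sign(string.Compare(a, b, StringComparison.InvariantCulture));

    public static void Main()
    {
        Console.WriteLine(OrdinalEquals("abc", "ABC")); // False: ordinal comparison is case-sensitive
        Console.WriteLine(OrdinalSign("abc", "ABC"));   // 1: 'a' has a higher code value than 'A'

        string composed = "\u00E9";    // e with acute accent as a single code value
        string decomposed = "e\u0301"; // plain e followed by a combining acute accent
        Console.WriteLine(OrdinalSign(composed, decomposed)); // 1: the code values differ
        Console.WriteLine(CultureSign(composed, decomposed)); // result depends on globalization settings
    }
}
```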
