Open CASCADE Technology 7.8.0
NCollection_UtfIterator< Type > Class Template Reference

Template class for Unicode strings support.

#include <NCollection_UtfIterator.hxx>

Public Member Functions

 NCollection_UtfIterator (const Type *theString)
 Constructor.
 
void Init (const Type *theString)
 Initialize iterator within specified NULL-terminated string.
 
NCollection_UtfIterator & operator++ ()
 Pre-increment operator. Reads the next Unicode symbol. Note: there is no protection against overrun!
 
NCollection_UtfIterator operator++ (int)
 Post-increment operator. Note: there is no protection against overrun!
 
bool operator== (const NCollection_UtfIterator &theRight) const
 Equality operator.
 
bool IsValid () const
 Return true if Unicode symbol is within valid range.
 
Standard_Utf32Char operator* () const
 Dereference operator.
 
const Type * BufferHere () const
 Buffer-fetching getter.
 
Type * ChangeBufferHere ()
 Buffer-fetching getter. Dangerous! Iterator should be reinitialized on buffer change.
 
const Type * BufferNext () const
 Buffer-fetching getter.
 
Standard_Integer Index () const
 
Standard_Integer AdvanceBytesUtf8 () const
 
Standard_Integer AdvanceBytesUtf16 () const
 
Standard_Integer AdvanceCodeUnitsUtf16 () const
 
Standard_Integer AdvanceBytesUtf32 () const
 
Standard_Utf8Char * GetUtf8 (Standard_Utf8Char *theBuffer) const
 Fill the UTF-8 buffer with the current Unicode symbol. Use AdvanceBytesUtf8() to allocate a buffer of sufficient size.
 
Standard_Utf8UChar * GetUtf8 (Standard_Utf8UChar *theBuffer) const
 
Standard_Utf16Char * GetUtf16 (Standard_Utf16Char *theBuffer) const
 Fill the UTF-16 buffer with the current Unicode symbol. Use AdvanceBytesUtf16() to allocate a buffer of sufficient size.
 
Standard_Utf32Char * GetUtf32 (Standard_Utf32Char *theBuffer) const
 Fill the UTF-32 buffer with the current Unicode symbol. Use AdvanceBytesUtf32() to allocate a buffer of sufficient size.
 
template<typename TypeWrite >
Standard_Integer AdvanceBytesUtf () const
 
template<typename TypeWrite >
TypeWrite * GetUtf (TypeWrite *theBuffer) const
 Fill the UTF-** buffer with the current Unicode symbol. Use the matching AdvanceBytesUtf**() method to allocate a buffer of sufficient size.
 

Detailed Description

template<typename Type>
class NCollection_UtfIterator< Type >

Template class for Unicode strings support.

It defines an iterator and provides a correct way to read multi-byte text (UTF-8 and UTF-16) and to convert it from one encoding to another. The current value of the iterator is returned as a UTF-32 Unicode symbol.

Here and below, the term "Unicode symbol" is used as a synonym for "Unicode code point".
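
As a rough illustration of what reading multi-byte text into UTF-32 involves, the sketch below decodes a NUL-terminated UTF-8 string into code points. It is a standalone approximation, not the OCCT implementation: decodeOneUtf8 and decodeUtf8 are hypothetical names, and reporting malformed bytes as U+FFFD is an assumption of this sketch.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not part of OCCT): decode one UTF-8 sequence starting
// at thePos, advance thePos past the bytes consumed, and return the code point.
// Malformed input is reported as U+FFFD (an assumption of this sketch).
inline char32_t decodeOneUtf8 (const unsigned char*& thePos)
{
  const unsigned char aLead = *thePos++;
  if (aLead < 0x80) { return aLead; }             // 1 byte: 0xxxxxxx
  int      aNbTrail = 0;
  char32_t aCode    = 0;
  if      ((aLead & 0xE0) == 0xC0) { aNbTrail = 1; aCode = aLead & 0x1F; }
  else if ((aLead & 0xF0) == 0xE0) { aNbTrail = 2; aCode = aLead & 0x0F; }
  else if ((aLead & 0xF8) == 0xF0) { aNbTrail = 3; aCode = aLead & 0x07; }
  else { return 0xFFFD; }                         // stray continuation byte
  for (int i = 0; i < aNbTrail; ++i)
  {
    // A continuation byte must look like 10xxxxxx; stopping here also
    // prevents stepping over the terminating NUL of a truncated string.
    if ((*thePos & 0xC0) != 0x80) { return 0xFFFD; }
    aCode = (aCode << 6) | (*thePos++ & 0x3F);
  }
  return aCode;
}

// Decode a whole NUL-terminated UTF-8 string into UTF-32 code points.
inline std::vector<char32_t> decodeUtf8 (const char* theString)
{
  std::vector<char32_t> aResult;
  const unsigned char* aPos = reinterpret_cast<const unsigned char*> (theString);
  while (*aPos != 0)
  {
    aResult.push_back (decodeOneUtf8 (aPos));
  }
  return aResult;
}

// Usage sketch: "A" followed by U+00E9 (two bytes in UTF-8).
inline void decodeUtf8Example()
{
  const std::vector<char32_t> aCodes = decodeUtf8 ("A\xC3\xA9");
  assert (aCodes.size() == 2 && aCodes[0] == 0x41 && aCodes[1] == 0xE9);
}
```

The loop terminates on the NUL byte, which mirrors the requirement that the iterated string be NULL-terminated.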

Constructor & Destructor Documentation

◆ NCollection_UtfIterator()

template<typename Type >
NCollection_UtfIterator< Type >::NCollection_UtfIterator ( const Type *  theString)
inline

Constructor.

Parameters
theString  buffer to iterate

Member Function Documentation

◆ AdvanceBytesUtf()

template<typename Type >
template<typename TypeWrite >
Standard_Integer NCollection_UtfIterator< Type >::AdvanceBytesUtf ( ) const
inline
Returns
the advance in TypeWrite chars needed to store current symbol

◆ AdvanceBytesUtf16()

template<typename Type >
Standard_Integer NCollection_UtfIterator< Type >::AdvanceBytesUtf16 ( ) const
Returns
the advance in bytes to store current symbol in UTF-16. 0 means an invalid symbol; 2 bytes is a general case; 4 bytes for surrogate pair.

◆ AdvanceBytesUtf32()

template<typename Type >
Standard_Integer NCollection_UtfIterator< Type >::AdvanceBytesUtf32 ( ) const
inline
Returns
the advance in bytes to store current symbol in UTF-32. Always 4 bytes (method for consistency).

◆ AdvanceBytesUtf8()

template<typename Type >
Standard_Integer NCollection_UtfIterator< Type >::AdvanceBytesUtf8 ( ) const
Returns
the advance in bytes to store current symbol in UTF-8. 0 means an invalid symbol; 1-4 bytes are valid range.
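
The 0 and 1-4 byte cases above follow from the standard UTF-8 ranges, which can be sketched as below. The function name utf8ByteCount is hypothetical, and treating surrogate values (U+D800 to U+DFFF) as invalid is an assumption of this sketch rather than a statement about the OCCT implementation.

```cpp
#include <cassert>

// Hypothetical helper (not part of OCCT): number of bytes needed to store
// a code point in UTF-8, or 0 if the value cannot be encoded.
inline int utf8ByteCount (char32_t theCode)
{
  if (theCode >= 0xD800 && theCode <= 0xDFFF) { return 0; } // surrogates (assumed invalid)
  if (theCode < 0x80)      { return 1; } // ASCII
  if (theCode < 0x800)     { return 2; }
  if (theCode < 0x10000)   { return 3; } // rest of the Basic Multilingual Plane
  if (theCode <= 0x10FFFF) { return 4; } // supplementary planes
  return 0;                              // beyond the Unicode range
}

// Usage sketch: U+20AC (euro sign) needs three bytes.
inline void utf8ByteCountExample()
{
  assert (utf8ByteCount (0x20AC) == 3);
}
```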

◆ AdvanceCodeUnitsUtf16()

template<typename Type >
Standard_Integer NCollection_UtfIterator< Type >::AdvanceCodeUnitsUtf16 ( ) const
Returns
the advance in 16-bit code units to store the current symbol in UTF-16. 0 means an invalid symbol; 1 code unit is the general case; 2 code units for a surrogate pair.
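
The relationship between the UTF-16 code-unit advance and the UTF-16 byte advance can be sketched as follows. Both function names are hypothetical, and rejecting surrogate values as input is an assumption of this sketch.

```cpp
#include <cassert>

// Hypothetical helper (not part of OCCT): number of 16-bit code units needed
// to store a code point in UTF-16, or 0 if the value cannot be encoded.
inline int utf16CodeUnitCount (char32_t theCode)
{
  if (theCode >= 0xD800 && theCode <= 0xDFFF) { return 0; } // lone surrogate (assumed invalid)
  if (theCode < 0x10000)   { return 1; } // Basic Multilingual Plane
  if (theCode <= 0x10FFFF) { return 2; } // surrogate pair
  return 0;                              // beyond the Unicode range
}

// Each UTF-16 code unit occupies two bytes.
inline int utf16ByteCount (char32_t theCode)
{
  return 2 * utf16CodeUnitCount (theCode);
}

// Usage sketch: U+1F600 lies outside the BMP and needs a surrogate pair.
inline void utf16CountExample()
{
  assert (utf16CodeUnitCount (0x1F600) == 2);
  assert (utf16ByteCount (0x1F600) == 4);
}
```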

◆ BufferHere()

template<typename Type >
const Type * NCollection_UtfIterator< Type >::BufferHere ( ) const
inline

Buffer-fetching getter.

◆ BufferNext()

template<typename Type >
const Type * NCollection_UtfIterator< Type >::BufferNext ( ) const
inline

Buffer-fetching getter.

◆ ChangeBufferHere()

template<typename Type >
Type * NCollection_UtfIterator< Type >::ChangeBufferHere ( )
inline

Buffer-fetching getter. Dangerous! Iterator should be reinitialized on buffer change.

◆ GetUtf()

template<typename Type >
template<typename TypeWrite >
TypeWrite * NCollection_UtfIterator< Type >::GetUtf ( TypeWrite *  theBuffer) const
inline

Fill the UTF-** buffer with the current Unicode symbol. Use the matching AdvanceBytesUtf**() method to allocate a buffer of sufficient size.

Parameters
theBuffer  buffer to fill
Returns
new buffer position (for next char)

◆ GetUtf16()

template<typename Type >
Standard_Utf16Char * NCollection_UtfIterator< Type >::GetUtf16 ( Standard_Utf16Char *  theBuffer) const

Fill the UTF-16 buffer with the current Unicode symbol. Use AdvanceBytesUtf16() to allocate a buffer of sufficient size.

Parameters
theBuffer  buffer to fill
Returns
new buffer position (for next char)
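
The "write, then return the new buffer position" contract can be illustrated for UTF-16 with the standalone sketch below. putUtf16 and toUtf16 are hypothetical names, not OCCT functions, and writing nothing for values that cannot be encoded is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of OCCT): write one code point as UTF-16 into
// theBuffer and return the position just past the last code unit written.
// The caller must provide room for up to two code units.
inline char16_t* putUtf16 (char32_t theCode, char16_t* theBuffer)
{
  if (theCode >= 0xD800 && theCode <= 0xDFFF) { return theBuffer; } // skipped (assumption)
  if (theCode < 0x10000)
  {
    *theBuffer++ = static_cast<char16_t> (theCode);
  }
  else if (theCode <= 0x10FFFF)
  {
    const char32_t aValue = theCode - 0x10000;                        // 20 significant bits
    *theBuffer++ = static_cast<char16_t> (0xD800 + (aValue >> 10));   // high surrogate
    *theBuffer++ = static_cast<char16_t> (0xDC00 + (aValue & 0x3FF)); // low surrogate
  }
  return theBuffer;
}

// Convert a UTF-32 string by chaining the returned positions.
inline std::u16string toUtf16 (const std::u32string& theText)
{
  std::u16string aResult;
  for (char32_t aCode : theText)
  {
    char16_t  aBuffer[2];
    char16_t* anEnd = putUtf16 (aCode, aBuffer);
    aResult.append (aBuffer, static_cast<std::size_t> (anEnd - aBuffer));
  }
  return aResult;
}

// Usage sketch: U+1F600 becomes the surrogate pair D83D DE00.
inline void toUtf16Example()
{
  const std::u16string aText = toUtf16 (U"\U0001F600");
  assert (aText.size() == 2 && aText[0] == 0xD83D && aText[1] == 0xDE00);
}
```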

◆ GetUtf32()

template<typename Type >
Standard_Utf32Char * NCollection_UtfIterator< Type >::GetUtf32 ( Standard_Utf32Char *  theBuffer) const

Fill the UTF-32 buffer with the current Unicode symbol. Use AdvanceBytesUtf32() to allocate a buffer of sufficient size.

Parameters
theBuffer  buffer to fill
Returns
new buffer position (for next char)

◆ GetUtf8() [1/2]

template<typename Type >
Standard_Utf8Char * NCollection_UtfIterator< Type >::GetUtf8 ( Standard_Utf8Char *  theBuffer) const

Fill the UTF-8 buffer with the current Unicode symbol. Use AdvanceBytesUtf8() to allocate a buffer of sufficient size.

Parameters
theBuffer  buffer to fill
Returns
new buffer position (for next char)
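
Because the returned pointer is the position for the next character, successive writes can be chained. The standalone sketch below shows the idea for UTF-8. putUtf8 and toUtf8 are hypothetical names, not OCCT functions, and writing nothing for values that cannot be encoded is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of OCCT): write one code point as UTF-8 into
// theBuffer and return the position just past the last byte written.
// The caller must provide room for up to four bytes.
inline char* putUtf8 (char32_t theCode, char* theBuffer)
{
  if (theCode >= 0xD800 && theCode <= 0xDFFF) { return theBuffer; } // skipped (assumption)
  if (theCode < 0x80)
  {
    *theBuffer++ = static_cast<char> (theCode);
  }
  else if (theCode < 0x800)
  {
    *theBuffer++ = static_cast<char> (0xC0 | (theCode >> 6));
    *theBuffer++ = static_cast<char> (0x80 | (theCode & 0x3F));
  }
  else if (theCode < 0x10000)
  {
    *theBuffer++ = static_cast<char> (0xE0 | (theCode >> 12));
    *theBuffer++ = static_cast<char> (0x80 | ((theCode >> 6) & 0x3F));
    *theBuffer++ = static_cast<char> (0x80 | (theCode & 0x3F));
  }
  else if (theCode <= 0x10FFFF)
  {
    *theBuffer++ = static_cast<char> (0xF0 | (theCode >> 18));
    *theBuffer++ = static_cast<char> (0x80 | ((theCode >> 12) & 0x3F));
    *theBuffer++ = static_cast<char> (0x80 | ((theCode >> 6) & 0x3F));
    *theBuffer++ = static_cast<char> (0x80 | (theCode & 0x3F));
  }
  return theBuffer;
}

// Convert a UTF-32 string by appending the bytes written for each code point.
inline std::string toUtf8 (const std::u32string& theText)
{
  std::string aResult;
  for (char32_t aCode : theText)
  {
    char  aBuffer[4];
    char* anEnd = putUtf8 (aCode, aBuffer);
    aResult.append (aBuffer, static_cast<std::size_t> (anEnd - aBuffer));
  }
  return aResult;
}

// Usage sketch: "A" plus U+20AC (euro sign) occupies 1 + 3 bytes.
inline void toUtf8Example()
{
  assert (toUtf8 (U"A\u20AC") == "A\xE2\x82\xAC");
}
```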

◆ GetUtf8() [2/2]

template<typename Type >
Standard_Utf8UChar * NCollection_UtfIterator< Type >::GetUtf8 ( Standard_Utf8UChar *  theBuffer) const

◆ Index()

template<typename Type >
Standard_Integer NCollection_UtfIterator< Type >::Index ( ) const
inline
Returns
the index displacement from iterator initialization (first symbol has index 0)

◆ Init()

template<typename Type >
void NCollection_UtfIterator< Type >::Init ( const Type *  theString)
inline

Initialize iterator within specified NULL-terminated string.

◆ IsValid()

template<typename Type >
bool NCollection_UtfIterator< Type >::IsValid ( ) const
inline

Return true if Unicode symbol is within valid range.

◆ operator*()

template<typename Type >
Standard_Utf32Char NCollection_UtfIterator< Type >::operator* ( ) const
inline

Dereference operator.

Returns
the UTF-32 code point of the symbol the iterator currently points to.

◆ operator++() [1/2]

template<typename Type >
NCollection_UtfIterator & NCollection_UtfIterator< Type >::operator++ ( )
inline

Pre-increment operator. Reads the next Unicode symbol. Note: there is no protection against overrun!

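◆ operator++() [2/2]

template<typename Type >
NCollection_UtfIterator NCollection_UtfIterator< Type >::operator++ ( int  )

Post-increment operator. Note: there is no protection against overrun!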

◆ operator==()

template<typename Type >
bool NCollection_UtfIterator< Type >::operator== ( const NCollection_UtfIterator< Type > &  theRight) const
inline

Equality operator.


The documentation for this class was generated from the following file: