<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><title>R: Check If a Data Stream Is Possibly in UTF-8</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<link rel="stylesheet" type="text/css" href="R.css" />
</head><body>

<table width="100%" summary="page for stri_enc_isutf8 {stringi}"><tr><td>stri_enc_isutf8 {stringi}</td><td style="text-align: right;">R Documentation</td></tr></table>

<h2>Check If a Data Stream Is Possibly in UTF-8</h2>

<h3>Description</h3>

<p>Checks whether each given sequence of bytes forms a valid UTF-8 string.
</p>

<h3>Usage</h3>

<pre>
stri_enc_isutf8(str)
</pre>

<h3>Arguments</h3>

<table summary="R argblock">
<tr valign="top"><td><code>str</code></td>
<td>
<p>a character vector, a raw vector, or a list of <code>raw</code> vectors</p>
</td></tr>
</table>

<h3>Details</h3>

<p><code>FALSE</code> means that a string is certainly not valid UTF-8.
However, false positives are possible: <code>TRUE</code> only means that the bytes
could be UTF-8. For instance, the byte pair <code>(c4,85)</code> encodes a single character
("Polish a with ogonek") in UTF-8, but the same bytes also form a valid
two-character sequence ("A umlaut", "Ellipsis") in WINDOWS-1250.
Also note that UTF-8, like most 8-bit encodings, extends ASCII, so any input for which
<code><a href="stri_enc_isascii.html">stri_enc_isascii</a></code> returns <code>TRUE</code>
also yields <code>TRUE</code> from <code><a href="stri_enc_isutf8.html">stri_enc_isutf8</a></code>.
</p>
<p>The longer the sequence, the more likely it is that a <code>TRUE</code> result
indicates genuine UTF-8, because not all sequences of bytes are valid UTF-8.
</p>
<p>This function is independent of the way <span style="font-family: Courier New, Courier; color: #666666;"><b>R</b></span> marks encodings in
character strings (see <a href="../../base/html/Encoding.html">Encoding</a> and <a href="stringi-encoding.html">stringi-encoding</a>).
</p>

<h3>Value</h3>

<p>Returns a logical vector.
Its i-th element indicates whether the i-th string
is a valid UTF-8 byte sequence.
</p>

<h3>See Also</h3>

<p>Other encoding_detection:
<code><a href="stri_enc_detect2.html">stri_enc_detect2</a>()</code>,
<code><a href="stri_enc_detect.html">stri_enc_detect</a>()</code>,
<code><a href="stri_enc_isascii.html">stri_enc_isascii</a>()</code>,
<code><a href="stri_enc_isutf16.html">stri_enc_isutf16be</a>()</code>,
<code><a href="stringi-encoding.html">stringi-encoding</a></code>
</p>

<h3>Examples</h3>

<pre>
# plain ASCII letters are also valid UTF-8
stri_enc_isutf8(letters[1:3])
# Polish a with ogonek, lower and upper case
stri_enc_isutf8("\u0105\u0104")
# other non-ASCII code points given as \u escapes
stri_enc_isutf8("\u1234\u0222")
</pre>

<hr /><div style="text-align: center;">[Package <em>stringi</em> version 1.4.6 <a href="00Index.html">Index</a>]</div>
</body></html>