A few days ago I wrote something that detects the encoding of a text file so that it can be transcoded correctly:
// Build a detector proxy that consults each registered detector in turn
CodepageDetectorProxy detector = CodepageDetectorProxy.getInstance();
detector.add(UnicodeDetector.getInstance());
detector.add(JChardetFacade.getInstance());
detector.add(ASCIIDetector.getInstance());
// url holds the path of the file to inspect
File f = new File(url);
Charset charset = detector.detectCodepage(f.toURI().toURL());
// Check whether the file is UTF-8 encoded; anything else is read as GBK
if (charset != null && "UTF-8".equalsIgnoreCase(charset.name())) {
    br = new BufferedReader(new InputStreamReader(new FileInputStream(f), "UTF-8"));
} else {
    br = new BufferedReader(new InputStreamReader(new FileInputStream(f), "GBK"));
}
It can detect quite a few encodings. OP, try printing the detected charset and see what you get.
The jars it depends on are cpdetector_1.0.10 and chardet (jchardet-1.1).
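If pulling in those jars is not an option, the same "UTF-8 or else GBK" decision can be roughly approximated with the JDK alone. The sketch below is not how cpdetector works internally. It swaps in a simpler technique: strictly decode the bytes as UTF-8 and treat any decoding error as "not UTF-8". The names EncodingGuess, isValidUtf8 and pickCharset are made up for illustration, and it assumes the GBK charset is available in your JRE.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of cpdetector or jchardet.
class EncodingGuess {

    // Returns true if the bytes form a well-formed UTF-8 sequence.
    // Pure ASCII input also counts as valid UTF-8.
    static boolean isValidUtf8(byte[] data) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            // A malformed byte sequence was reported, so this is not UTF-8
            return false;
        }
    }

    // Mirrors the if/else in the post: UTF-8 when the bytes validate,
    // otherwise GBK as an assumed fallback.
    static Charset pickCharset(byte[] data) {
        return isValidUtf8(data) ? StandardCharsets.UTF_8 : Charset.forName("GBK");
    }
}
```

To use it, read the whole file with Files.readAllBytes, pass the bytes to pickCharset, and build the text with new String(bytes, charset). Unlike the detector above, this can only tell UTF-8 apart from everything else, so it is only a fit when GBK is the sole alternative you expect.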