Category Archives: Java

Resource Scheduler, Calculator, Short-Circuit in Hadoop YARN and HDFS

To carry out next year's plan, I surveyed the research topics and technologies in Hadoop YARN and HDFS and took the following notes: Since Hadoop YARN was proposed, new-generation technologies have been continuously discussed. To understand the … Continue reading

Posted in hadoop, Java, Programming, 學術研究, 程式設計 | Leave a comment

Process SequenceFile without Enabling Hadoop Platform

Recently I needed to read Hadoop's SequenceFile format without a running Hadoop platform. However, most examples show how to read and write a SequenceFile on a Hadoop platform. How do I read such files without Hadoop? There is a workaround for this case. 1. Download … Continue reading

Posted in hadoop, Java, Programming, 程式設計 | Leave a comment

Print output while Processing HTML/XML data in Jsoup Project

Recently, I ran into a problem while retrieving XML data from a website. In my case, assume that the original XML document looks like <result><device /><name>Allen’s device</name></result>. If I use Jsoup.parse(File, "UTF-8"); without additional options, the returned document object will be like: … Continue reading

Posted in Java, Programming, XML, 程式設計, 網路 java | Leave a comment

How to Calculate (I mod N) Fast?

Given an integer I and an integer N that is a power of 2, how can “I mod N” be calculated faster? OpenJDK’s java.util.HashMap.indexFor method shows a neat solution. It simply calculates “I (bitwise AND) (N-1) … Continue reading
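A minimal sketch of the masking trick, assuming N is a positive power of two. The class and method names (FastMod, mod, isPowerOfTwo) are hypothetical and chosen for illustration; only the expression I & (N - 1) comes from the excerpt.

```java
// Sketch only: FastMod, mod and isPowerOfTwo are illustrative names, not OpenJDK code.
final class FastMod {

    // True when n is a positive power of two (exactly one bit set).
    static boolean isPowerOfTwo(int n) {
        return n > 0 && (n & (n - 1)) == 0;
    }

    // Computes i mod n with a single bitwise AND.
    // n - 1 is a mask of the low bits, so the AND keeps exactly the remainder bits.
    static int mod(int i, int n) {
        if (!isPowerOfTwo(n)) {
            throw new IllegalArgumentException("n must be a positive power of two: " + n);
        }
        return i & (n - 1);
    }

    public static void main(String[] args) {
        System.out.println(mod(13, 8));            // 13 = 0b1101, mask = 0b0111, prints 5
        System.out.println(mod(-3, 8));            // prints 5, the non-negative remainder
        System.out.println(-3 % 8);                // prints -3, Java's % keeps the sign of the dividend
        System.out.println(Math.floorMod(-3, 8));  // prints 5, same as the masked result
    }
}
```

For negative I the mask gives the same result as Math.floorMod rather than the % operator, which is convenient for a table index since the result always lies in the range 0 to N-1.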

Posted in Java, Programming, 程式設計 | Leave a comment

Install JDK in Debian

1. Download the JDK package from Oracle:
    wget --header "Cookie: oraclelicense=accept-securebackup-cookie" http://download.oracle.com/otn-pub/java/jdk/8u40-b26/jdk-8u40-linux-x64.tar.gz
2. Set up the JDK environment:
    sudo su
    mkdir /opt/jdk
    tar -zxf jdk-8u40-linux-x64.tar.gz -C /opt/jdk
    update-alternatives --install /usr/bin/java java /opt/jdk/jdk1.8.0_40/bin/java 100
    update-alternatives --install /usr/bin/javac javac /opt/jdk/jdk1.8.0_40/bin/javac 100
3. Check the Java environment. To check the Java environment, … Continue reading
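As a complement to step 3, here is a minimal sketch that reports which Java runtime is actually launched once the alternatives are registered. The class name JavaEnvCheck is hypothetical, and this is not necessarily the check the full post describes.

```java
// Sketch only: JavaEnvCheck is an illustrative name, not part of the original post.
final class JavaEnvCheck {

    // Builds a one-line summary from standard JVM system properties.
    static String describe() {
        return "java.version=" + System.getProperty("java.version")
             + ", java.home=" + System.getProperty("java.home");
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Compiling it with javac JavaEnvCheck.java and running java JavaEnvCheck exercises both registered commands. With the paths used above, java.home would be expected to point somewhere under /opt/jdk/jdk1.8.0_40.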

Posted in Java, Linux | Leave a comment

Chinese Word Segmentation in Lucene

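If you want to use Lucene's word segmentation, it is worth reading the following first:
1. Lucene introduction slides (recommended)
2. A brief introduction to Lucene (recommended)
3. A comparison of the main Chinese analyzers currently available for Lucene
4. The Chinese word segmentation system in Lucene 3.0 (recommended)
5. The Smart Chinese segmenter built into Lucene 4.6.1, the latest version (recommended)
6. IKAnalyzer for Lucene 4.x
Current Lucene segmenters support Simplified Chinese first. To handle Traditional Chinese, use a Traditional-to-Simplified conversion API:
JCC: A Java Chinese Convertor
If you would rather not dig into all of this, you can use Solr directly (a production-grade product built on Lucene):
1. Introduction to Apache Solr (explains how to configure segmentation in Solr, though it still focuses on Simplified Chinese)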

Posted in Java, Programming, 程式設計, 資工, 軟體(Software) | Leave a comment

I-List: Create Your Own Lists of Links

I-List is a very helpful link-sharing platform for sharing the links you have collected. It provides a user-friendly interface for sharing a given kind of link. I’ve found two collections that are very helpful for some … Continue reading

Posted in C, C/C++, cloud computing, Java, Linux, Linux&FreeBSD, Windows, 科技, 程式設計, 資工, 軟體(Software), 雲端運算 | Leave a comment

XPATH, MapReduce and SAX

SAX Create XPath
XPath4Sax
SPEX: XPath Evaluation against XML Streams
Apache MRQL: MRQL is a query processing and optimization system for large-scale, distributed data analysis, built on top of Apache Hadoop, Hama, and Spark.

Posted in hadoop, Java, XML, 程式設計, 雲端運算 | Leave a comment

[Program] Frameworks for Auto-Updating Programs

I’ve compiled a list of frameworks for auto-updating programs. If you want to add auto-update support to your program, you can refer to the following table. Table of Auto-Update Program Frameworks: Name | Tutorials | Platform | License | Dev. Lang. … Continue reading

Posted in C, C/C++, Java, Programming, 程式設計, 軟體(Software) | Leave a comment

[Hadoop] Hadoop Installation and Hands-on Hadoop at the NCHC

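While teaching a course at a university of science and technology in central Taiwan, I covered Hadoop, so I am releasing some corrected teaching materials.
Hadoop single-node installation: this differs somewhat from the single-node installation tutorial provided by the National Center for High-performance Computing (NCHC). The differences are in how Hadoop 0.22.x is started and how JDK 1.7 is installed. Click here to view.
Hands-on Hadoop at the NCHC: the class covered how to use the Hadoop cluster at the NCHC, so I am releasing this material as well. Click here to view.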

Posted in cloud computing, hadoop, Java, 程式設計 | Leave a comment