Yan's Notes

HDFS UI detailed description

Posted on 2019-02-09 | Edited on 2020-08-01 |
HDFS UI. Open the HDFS UI in a browser; the URL is {hadoop_master_ip:50070}. This article aims to explain everything in the HDFS Overview UI in detail, oth ...

Basic understanding of debian packaging

Posted on 2019-02-07 | Edited on 2020-08-02 |
Background. Setting up, installing, and upgrading environments never becomes easy. This article describes how to use and create Debian packages to help setup ...

hadoop cluster incidents

Posted on 2019-01-25 | Edited on 2020-08-01 |
Purpose. Operating a Hadoop cluster is not easy. There are many parameters, and no single configuration fits all situations. Monitorin ...

Very basic understanding of CAP theorem

Posted on 2019-01-19 | Edited on 2020-08-01 |
Wiki description. According to the wiki, the CAP theorem states that it is impossible for a distributed data store to simultaneously provide more than two ...

hive GenericUDF and DeferredJavaObject analysis

Posted on 2019-01-09 | Edited on 2020-08-01 |
Background. This article discusses how the Hive generic user-defined function (GenericUDF) works. The Javadoc says GenericUDF can do short-c ...

create git hooks in mac

Posted on 2018-12-15 | Edited on 2020-08-01 | In git |
This article explains how to create git hooks on a Mac and how to use customized git hook chains. Preparation for creating hook chain (This part ref ...

hadoop local mode and distributed mode

Posted on 2018-12-09 | Edited on 2020-08-01 | In big data |
Whether a job runs in local mode or distributed mode is determined by mapreduce.framework.name. In local mode, the mapper and reducer run locally in ...

hive configuration understanding

Posted on 2018-11-19 | Edited on 2020-08-01 | In big data |
This article introduces the manually configured settings that override the defaults when using Hive. Environment. This article is bas ...

hive scratch directory

Posted on 2018-11-17 | Edited on 2020-08-01 | In big data |
This article explains the Hive scratch directory. Scratch directory usage. The Hive scratch directory is a temporary working space for storing the pla ...

How to fast fail hive jobs

Posted on 2018-11-15 | Edited on 2020-08-04 | In big data |
Background. When running Hive jobs on MapReduce in Hadoop clusters, we always set a limit on how much local and HDFS disk space a job can use. ...

Wang Yan

© 2021 Wang Yan