How to fix curl errors in R

If you encounter a “curl” error when installing packages in R, follow these steps.

Install libcurl-devel

  1. $ sudo yum install libcurl-devel   (RHEL / CentOS)
     $ sudo apt-get install libcurl4-openssl-dev   (Ubuntu)
  2. Invoke R with sudo:
     $ sudo -i R
  3. > install.packages("forecast")  or  install.packages("forecast", dependencies=TRUE)

Wait for the installation to complete and verify it as follows.

4.  > library(forecast)

Bingo !! all done.

Get the public IP of a Linux box via SSH

To get the public IP (not the local IP) of a remote Linux box, use the following command.

$ wget -qO- http://ipecho.net/plain | xargs echo

This returns the external (public) IP of your machine.

The system administrator has set policies to prevent this installation

This is a common problem when you try to install .msi files on Windows Server. There is a simple way to get past it.

Step 01 : Open Command Prompt as Administrator.

Step 02 : Go to the installation folder.

Step 03 : msiexec /i <nameoffile.msi>

You are all set and done.


Iterate files in folder using Spark Scala

This script loops through an HDFS directory, reads the first line of each file, and writes it to the console. For the most part it is self-explanatory.

The script uses the pipe delimiter “|”. It is optional and can be skipped if your files use the default comma delimiter.

import org.apache.hadoop.fs.{FileSystem, Path}

// Run inside spark-shell, where `spark` (the SparkSession) is predefined.
// Use Spark's Hadoop configuration so the HDFS settings are picked up.
val path = "/hdfspath/"
val fs = FileSystem.get(spark.sparkContext.hadoopConfiguration)
val ls = fs.listStatus(new Path(path))

ls.foreach { x =>
  val f = x.getPath.toString
  println(f)
  // Read the file as pipe-delimited CSV and show its first row.
  val content = spark.read.option("delimiter", "|").csv(f)
  content.show(1)
}

System.exit(0)
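The same listing pattern can be sketched without a cluster. Below is a hypothetical local-filesystem analogue in plain Scala, using java.io.File in place of the Hadoop FileSystem API and reading the first line directly instead of through spark.read:

```scala
import java.io.File
import scala.io.Source

// Hypothetical local analogue of the HDFS loop above: list every file
// in a directory and print/return the first line of each.
def firstLines(dir: File): Seq[String] =
  dir.listFiles().filter(_.isFile).sortBy(_.getName).toSeq.map { f =>
    println(f.getPath)                 // like println(f) in the HDFS version
    val src = Source.fromFile(f)
    try {
      val it = src.getLines()
      if (it.hasNext) it.next() else "" // first line only; "" for empty files
    } finally src.close()
  }
```

On HDFS the equivalent of `listFiles` is `fs.listStatus`, as shown above.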

Spark Scala Command line Arguments

It is pretty straightforward to pass command-line arguments to the Spark shell (Scala).

$ ./spark-2.0.0-bin-hadoop2.6/bin/spark-shell -i ~/scalaparam.scala --conf spark.driver.args="param1value  param2value  param3value"

Parameter values are separated by spaces (param1value  param2value  param3value).

Contents of scalaparam.scala:

val args = sc.getConf.get("spark.driver.args").split("\\s+")
val param1=args(0)
val param2=args(1)
val param3=args(2)
println("param1 passed from shell : " + param1)
println("param2 passed from shell : " + param2)
println("param3 passed from shell : " + param3)
System.exit(0)

The trick is sc.getConf.get("spark.driver.args").split("\\s+"), which splits the value on whitespace. (Remember: "\\s+" is a regular expression.)
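To see why the split handles repeated spaces between values, here is the same regex applied to a plain string outside Spark (the argument values are just illustrations):

```scala
// "\\s+" matches one or more whitespace characters, so runs of spaces
// between values collapse into a single split point. trim drops any
// leading/trailing whitespace before splitting.
def parseArgs(raw: String): Array[String] = raw.trim.split("\\s+")

val parsed = parseArgs("param1value  param2value  param3value")
println(parsed.mkString(", "))  // param1value, param2value, param3value
```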


MySQL client ran out of memory

Error Message :

Open Database Connectivity (ODBC) error occurred. state: 'HY000'. Native Error Code: 2008. [MySQL][ODBC 5.3(a) Driver][mysqld-5.5.5-10.1.22-MariaDB]MySQL client ran out of memory

This error is caused by a large number of rows being cached in the client machine’s memory. If you are reading data from MySQL and not performing any seek operations, you can disable caching and free up the memory.

Control Panel > Administrative Tools > ODBC   (64-bit ODBC client)

C:\Windows\SysWOW64\odbcad32.exe   (32-bit ODBC client)

Open your DSN Name > Configure > Details > Cursor/Results and check “Don’t cache results of forward-only cursors”.
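If you connect without a DSN, the same setting can, as far as I know, be supplied through the OPTION bit-flags in the connection string: the “Don’t cache results” flag (FLAG_NO_CACHE) corresponds to the value 1048576 in MySQL Connector/ODBC. The server, database, and credentials below are placeholders:

```
DRIVER={MySQL ODBC 5.3 ANSI Driver};SERVER=myserver;DATABASE=mydb;UID=myuser;PWD=mypass;OPTION=1048576
```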