Programming Breaks: August 2013

Saturday, August 24, 2013

VS IDE Post-build Event: Copy Files

The following line (specified in the "Post-build event command line" under "Build Events" of the project property) will copies the dll generated in a Visual Studio project to a folder "Libs" in the solution directory:

xcopy $(TargetDir)*.dll "$(SolutionDir)Libs" /D /Y

Saturday, August 17, 2013

Calculate Geo Distance in SQL query

The follow SQL query computes the geo distance between two points ([LAT_CENTER], [LON_CENTER]) and ([LAT1], [LON1]), where ([LAT1], [LON1]) is taken from a datatable zipcensus:

SELECT ACOS(COS(centerlatrad)*COS(latrad)*COS(centerlongrad - longrad) + SIN(centerlatrad)*SIN(latrad)) * radius as distcirc FROM (SELECT latitude-([LAT_CENTER]) as difflat, longitude - ([LON_CENTER]) as difflong, latitude*PI()/180 as latrad, [LAT_CENTER]*PI()/180 as centerlatrad, longitude*PI()/180 as longrad, [LON_CENTER]*PI()/180 as centerlongrad, 3949.9 as radius FROM zipcensus zc WHERE latitude = [LAT1] AND longitude = [LON1]) zc) zc

The follow SQL query computes the top 10 nearby points to ([LAT_CENTER], [LON_CENTER])

SELECT TOP 10 zc.*, ACOS(COS(centerlatrad)*COS(latrad)*COS(centerlongrad - longrad) + SIN(centerlatrad)*SIN(latrad)) * radius as distcirc FROM (SELECT latitude-([LAT_CENTER]) as difflat, longitude - ([LON_CENTER]) as difflong, latitude*PI()/180 as latrad, [LAT_CENTER]*PI()/180 as centerlatrad, longitude*PI()/180 as longrad, [LON_CENTER]*PI()/180 as centerlongrad, 3949.9 as radius FROM zipcensus zc) zc) zc ORDER BY distcirc

The formula uses several trigonometric functions, so the innermost query converts all the latitudes and longitudes to radians. In addition, this method uses the radius of the earth, which is taken to be 3,949.9 miles, the unit of the distance returned (i.e. distcirc) is in miles

Thursday, August 15, 2013

SQLCE query alternative for SQL Server SQL

In SQL Server, you can use convenient features such as COUNT(DISTINCT db_field) or YEAR(orderdate). However, SQL functions such as COUNT(DISTINCT ...) and YEAR are not available for SQL Server Compact Edition, therefore, one has to write alternatives to get them to work. below are the example.

SQLCE Alternative to COUNT(DISTINCT ..)

//In SQL Server, will not work in SQLCE 
string query_SQLServer="SELECT COUNT(DISTINCT householdid) FROM customer c JOIN order o ON c.customerid=o.customerid";

//In SQLCE, will work for both SQL Sever and SQLCE
string query_SQLCE="SELECT COUNT(*) FROM (SELECT DISTINCT householdid FROM customer c JOIN order o ON c.customerid=o.customerid) a";

SQLCE Alternative to YEAR()

//In SQL Server, will not work in SQLCE 
string query2_SQLServer="SELECT YEAR(orderdate) FROM customer c JOIN order o ON c.customerid=o.customerid";

//In SQLCE, will work for both SQL Sever and SQLCE
string query2_SQLCE="SELECT DATEPART(year, orderdate) FROM customer c JOIN order o ON c.customerid=o.customerid";

SQLCE Alternative to SQUARE()

//In SQL Server, will not work in SQLCE 
string query_SQLCE="SELECT SQUARE(amount) as amountsqr FROM orders";

//In SQLCE, will work for both SQL Sever and SQLCE
string query_SQLCE="SELECT POWER(amount, 2) as amountsqr FROM orders";

Tuesday, August 13, 2013

How to return the list of column names about a datatable from SQL Compact Edition Database file

Suppose you have a datatable called "calendar" inside your SQLCE database file, but you don't know about the column names of the database, the following code can be used to retrieve the column headers:

string query="SELECT COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = 'calendar'";
string conn_string="[YOUR DB CONNECTION STRING]";
try
{
 using (SqlCeConnection conn = new SqlCeConnection(conn_string))
 {
  conn.Open();
  using (SqlCeCommand cmd = new SqlCeCommand(query, conn))
  {
   SqlCeDataReader reader=cmd.ExecuteReader();
   while (reader.Read())
   {
    int fc=reader.FieldCount;
    for (int i = 0; i < fc; ++i)
    {
     Console.WriteLine("DB Column Header: {0}", reader.GetValue(i));
    }
   }
  }
  conn.Close();
 }
}
catch (Exception ex)
{
 Console.WriteLine("Error executing: {0}", query);
 Console.WriteLine(ex.ToString());
}

One interesting observation is that if one has a column named "national" within the "calendar" table and tries to retrieve it using "SELECT national from calendar", the query will fail as it seems that "national" may be a keyword in the SQLCE (which i am not sure at this point).

Monday, August 12, 2013

Increase the SQL Compact Edition database file size

By default the size of the sqlce database file is only 2GB, which may not be enough for some application. the following shows the connection string which allows to increase the file size to 4GB.

using System.Data.SqlServerCe;
using System.Data;
using System.IO;

string db_file_path="[YOUR DB FILE LOCATION]"; 
string db_pass = "[YOUR DB PASSWORD]";
int db_size_in_MB = 4000; //The absolute maximum size of the database is 4gb.
string conn_string = string.Format("DataSource=\"{0}\";Password=\"{1}\";Max Database Size={2}", db_file_path, db_pass, db_size_in_MB);

if(!File.Exists(db_file_path))
{
 SqlCeEngine en = new SqlCeEngine(conn_string);
 en.CreateDatabase();
}

string query= @"CREATE TABLE [product](
         [PRODUCTID] [int] NULL,
         [PRODUCTNAME] [nvarchar](50) NULL,
         [PRODUCTGROUPCODE] [nvarchar](50) NULL,
         [PRODUCTGROUPNAME] [nvarchar](50) NULL,
         [INSTOCKFLAG] [nvarchar](50) NULL,
         [FULLPRICE] [money] NULL
        ) ";

try
{
 using (SqlCeConnection conn = new SqlCeConnection(conn_string))
 {
  conn.Open();
  using (SqlCeCommand cmd = new SqlCeCommand(query, conn))
  {
   cmd.ExecuteNonQuery();
  }
  conn.Close();
 }
}
catch (Exception ex)
{
 Console.WriteLine("Error executing: {0}", query);
 Console.WriteLine(ex.ToString());
}

Sunday, August 11, 2013

C# to MS SQL Compact Edition Data Type Mapping

SQLCE DataType	C# DataType
numeric	decimal
money	decimal
smallint	Int16
int	Int32
bigint	Int64
float	float
double	double
nvarchar, nchar	string

Bulk Insert into MS SQL Compact Edition Database from CSV files using C#

In MS SQL Server, bulk insert into database from CSV files can be easily done by calling 'LOAD' command from SQL query. However, this command is not supported in MS SQL compact edition. By inserting line by line into sqlce database is obviously too slow for most application that consumes large amount of data. Below is a source code written in C# for sqlce which allows users to obtain performance similar to bulk insert in MS SQL Server

public void BulkInsertFromCSV(string file_path, string table_name, string connection_string, Dictionary<string, Type> data_types)
{
 string line;
 List<string> column_names = new List<string>();
 using (StreamReader reader = new StreamReader(file_path))
 {
  line = reader.ReadLine();
  string[] texts = line.Split('\t');
  foreach (string txt in texts)
  {
   column_names.Add(txt);
  }

   using (SqlCeConnection conn = new SqlCeConnection(connection_string))
   {
    SqlCeCommand cmd = new SqlCeCommand();
    SqlCeResultSet rs;
    SqlCeUpdatableRecord rec;
    conn.Open();
    cmd.Connection = conn;
    cmd.CommandText = table_name;
    cmd.CommandType = CommandType.TableDirect;

    rs = cmd.ExecuteResultSet(ResultSetOptions.Updatable);
    

    while ((line = reader.ReadLine()) != null)
    {
     texts = line.Split('\t');

     rec = rs.CreateRecord();

     for (int j = 0; j < column_names.Count; ++j)
     {
      string columnName = column_names[j];
      
      int ordinal = rec.GetOrdinal(columnName);
      string param_value = "";
      if (j < texts.Length)
      {
       param_value = texts[j];
      }

      Type data_type=data_types[columnName];
      if (data_type == typeof(Int32))
      {
       Int32 value=0;
       int.TryParse(param_value, out value);
       rec.SetInt32(ordinal, value);
      }
      else if (data_type == typeof(Int64))
      {
       Int64 value = 0;
       Int64.TryParse(param_value, out value);
       rec.SetInt64(ordinal, value);
      }
      else if (data_type == typeof(Int16))
      {
       Int16 value = 0;
       Int16.TryParse(param_value, out value);
       rec.SetInt16(ordinal, value);
      }
      else if (data_type == typeof(string))
      {
       rec.SetString(ordinal, param_value);
      }
      else if (data_type == typeof(double))
      {
       double value=0;
       double.TryParse(param_value, out value);
       rec.SetDouble(ordinal, value);
      }
      else if (data_type == typeof(float))
      {
       float value = 0;
       float.TryParse(param_value, out value);
       rec.SetFloat(ordinal, value);
      }
      else if (data_type == typeof(DateTime))
      {
       DateTime value;
       if(DateTime.TryParse(param_value, out value))
           rec.SetDateTime(ordinal, value);
      }

      else if (data_type == typeof(decimal))
      {
         decimal value;
         decimal.TryParse(param_value, out value);
         rec.SetDecimal(ordinal, value);
      }
     }

     rs.Insert(rec);
    }

    rs.Close();
    rs.Dispose();
    cmd.Dispose();
   
  }
 }
}

As can seen in the code above, the parameters for the method include:

file_path: full path to the CSV file, which is assumed to be tab-separated and has a header column which has the same names as the database column names
table_name: name of the database table into which the CSV file data will be populated
connection_string: the database connection string
data_types: a data type holder of Dictionary<string, Type>, for which the key specifies the name of the datatable column while the type specifies the type of data held by the column

The usage of the method is very simple, support the datatable schema looks like the following:

CREATE TABLE [product](
 [PRODUCTID] [int] NULL,
 [PRODUCTNAME] [nvarchar](50) NULL,
 [PRODUCTGROUPCODE] [nvarchar](50) NULL,
 [PRODUCTGROUPNAME] [nvarchar](50) NULL,
 [INSTOCKFLAG] [nvarchar](50) NULL,
 [FULLPRICE] [money] NULL
)

And the CSV file C:\temp.csv looks like the following

PRODUCTID PRODUCTNAME PRODUCTGROUPCODE PRODUCTGROUPNAME INSTOCKFLAG FULLPRICE
10001  CA CALENDAR N 15
10002  CA CALENDAR N 10
10003  CA CALENDAR N 10
10004  BK BOOK Y 90
10005  BK BOOK Y 16
10006  BK BOOK N 10
10007  BK BOOK N 7
10008  BK BOOK N 11

The invoking of the method will be something like:

string connection_string="[YOUR DB CONNECTION STRING]";
Dictionary<string, Type> data_types=new Dictionary<string, Type>();
data_types["PRODUCTID"]=typeof(Int32);
data_types["PRODUCTNAME"]=typeof(string);
data_types["PRODUCTGROUPCODE"]=typeof(string);
data_types["INSTOCKFLAG"]=typeof(string);
data_types["FULLPRICE"]=typeof(float);
BulkInsertFromCSV(@"C:\temp.csv", "product", connection_string, data_types);

One more thing, i found that there is a bug that prevents insertion of float value to the datatable. In my implementation, the .net complains that "cannot insert Single to column (xxx). The column type is Float". Although i already specify using "SetFloat" but the errors simply do not go away, therefore, currently I am using "SetValue(ordinal, value)" in place of "SetFloat(ordinal, value)" which kind of solving my problem.

Javascript Video Player to play AVI, WMV, and other formats such as MP4

There are quite a number of javascript-based video player which leverage either HTML5 or flash player to player video files from the server. However, majority of them does not support playback of file format such as AVI and WMV. This post shows how one can create a simple page in javascript and PHP that support playback of most video formats including AVI and WMV.

Step 1: Download Flowplayer
Download the flow player from the following link:

http://flowplayer.org/download/

unzip the flowplayer to the folder "flowplayer" below your application root directory.

Step 2: Implement PHP video player page

Implement a video player page in PHP that playback the video, the source code of the PHP is shown below:

<?php

$folder_name=$_GET['dirno'];
$video_name=$_GET['vid'];

$path_parts = pathinfo($video_name);
$video_format= $path_parts['extension']; 
?>

<!DOCTYPE html>
<html>
<head>
  <title>Meme Analytics Video Demos</title>

  <?php if($video_format=='avi' || $video_format=='wmv'):?>
  <?php else: ?>
   <!-- player skin -->
   <link rel="stylesheet" type="text/css" href="flowplayer/skin/minimalist.css">

   <!-- site specific styling -->
   <style type="text/css">
   body { font: 12px "Myriad Pro", "Lucida Grande", sans-serif; text-align: center; padding-top: 5%; }
   .flowplayer { width: 80%; }
   </style>

   <!-- flowplayer depends on jQuery 1.7.1+ (for now) -->
   <script type="text/javascript" src="http://ajax.googleapis.com/ajax/libs/jquery/1/jquery.min.js"></script>

   <!-- include flowplayer -->
   <script type="text/javascript" src="flowplayer/flowplayer.min.js"></script>
  <?php endif; ?>

</head>
<body>
  <?php if($video_format=='avi' || $video_format=='wmv'):?>
  <object classid="clsid:67DABFBF-D0AB-41fa-9C46-CC0F21721616" width="320" height="260" codebase="http://go.divx.com/plugin/DivXBrowserPlugin.cab">

  <param name="custommode" value="none" />

   <param name="previewImage" value="image.png" />
   <param name="autoPlay" value="false" />
   <param name="src" value="<?php echo $folder_name.'/'.$video_name; ?>" />

 <embed type="video/divx" src="<?php echo $folder_name.'/'.$video_name; ?>" custommode="none" width="320" height="260" autoPlay="false"  previewImage="image.png"  pluginspage="http://go.divx.com/plugin/download/">
 </embed>
 </object>
 <br />No video? <a href="http://www.divx.com/software/divx-plus/web-player" target="_blank">Download</a> the DivX Plus Web Player.
  <?php else: ?>
  <!-- the player -->
   <div class="flowplayer" data-swf="flowplayer/flowplayer.swf" data-ratio="0.4167">
      <video>
          <source src="http://xxx-xxx.com/demo/videos/<?php echo $folder_name.'/'.$video_name; ?>" type='video/<?php echo $video_format; ?>' />
         
      </video>
   </div>
  <?php endif; ?>
</body>
<html>

The PHP player page accept to parameters from the URL: dirno and vid, which are the directory name and the video filename, it then checks the extension of the video file, if the file is a AVI or WMV file, it invokes the DIVX web player, otherwise, the flowplayer is used to play the video.

Step 3: Use VLC instead of DIVX to player AVI online

In case the DIVX web player does not work with your AVI files, you can also try other options, below is the same PHP script rewritten to use VLC player for AVI.

<?php

$folder_name=$_GET['dirno'];
$video_name=$_GET['vid'];

$path_parts = pathinfo($video_name);
$video_format= $path_parts['extension']; 
?>

<!DOCTYPE html>
<html>
<head>
  <title>Video Demos</title>

  <?php if($video_format=='avi' || $video_format=='wmv'):?>
  <?php else: ?>
   <!-- player skin -->
   <link rel="stylesheet" type="text/css" href="flowplayer/skin/minimalist.css">

   <!-- site specific styling -->
   <style type="text/css">
   body { font: 12px "Myriad Pro", "Lucida Grande", sans-serif; text-align: center; padding-top: 5%; }
   .flowplayer { width: 80%; }
   </style>

   <!-- flowplayer depends on jQuery 1.7.1+ (for now) -->
   <script type="text/javascript" src="http://ajax.googleapis.com/ajax/libs/jquery/1/jquery.min.js"></script>

   <!-- include flowplayer -->
   <script type="text/javascript" src="flowplayer/flowplayer.min.js"></script>
  <?php endif; ?>

</head>
<body>
  <?php if($video_format=='avi' || $video_format=='wmv'):?>
  
  <object type="application/x-vlc-plugin" pluginspage="http://www.videolan.org" data="http://xxx-xxx.com/demo/videos/<?php echo $folder_name.'/'.$video_name; ?>" width="400" height="300" id="video1">
     <param name="movie" value="http://xxx-xxx.com/demo/videos/<?php echo $folder_name.'/'.$video_name; ?>"/>
     <embed type="application/x-vlc-plugin" name="video1"
     autoplay="no" loop="no" width="400" height="300"
     target="http://xxx-xxx.com/demo/videos/<?php echo $folder_name.'/'.$video_name; ?>" />
  <br />
     <a href="http://xxx-xxx.com/demo/videos/<?php echo $folder_name.'/'.$video_name; ?>">Download Video</a>
  <br />
   Video Not Played? download <a href="http://www.videolan.org" target="_blank">VLC Player</a> 
</object>

  
  <?php else: ?>
  <!-- the player -->
   <div class="flowplayer" data-swf="flowplayer/flowplayer.swf" data-ratio="0.4167">
      <video>
          <source src="http://xxx-xxx.com/demo/videos/<?php echo $folder_name.'/'.$video_name; ?>" type='video/<?php echo $video_format; ?>' />
      </video>
   </div>
  <?php endif; ?>
</body>
</html>

Include Latex in Blogger

The original post is at:

http://webapps.stackexchange.com/questions/13053/mathjax-on-blogger

The basic step is to include the following javascript in the header section of the template (which can be accessed by Design→Edit HTML→Edit Template)

<script type="text/javascript" src="http://cdn.mathjax.org/mathjax/latest/MathJax.js">
MathJax.Hub.Config({
 extensions: ["tex2jax.js","TeX/AMSmath.js","TeX/AMSsymbols.js"],
 jax: ["input/TeX", "output/HTML-CSS"],
 tex2jax: {
     inlineMath: [ ['$','$'], ["\\(","\\)"] ],
     displayMath: [ ['$$','$$'], ["\\[","\\]"] ],
 },
 "HTML-CSS": { availableFonts: ["TeX"] }
});
</script>

With the body, the blogger can write something like the one below:

$f(x, y)=\frac{x^2}{y}$

which will be turned into:

$f(x, y)=\frac{x^2}{y}$

Criteria to find the best rules in association rule mining

Criteria 1: Support Level
In association rule mining (ARM), we can rank the association rules by their support level. In this way, for the combination of two product or items, we may measure the fitness of an association rule (Ci, Nj) by its support level in the database Support(Ci, Nj).

Support(Ci, Nj) can be measured as ActualCount(Ci, Nj) or ActualCount(Ci, Nj) / TotalCount

Criteria 2: Chi-Square
However, from statistics, we know that it is possible the high support level may be by chance. Therefore, we want to know that the high support level, Support(Ci, Nj), of an association rule is a true phenomenon instead of being randomly created by chance. To this end, we can measure the ChiSquare(Ci, Nj). The details of ChiSquare(Ci, Nj) calculation can be found in the following blog:

http://czcodezone.blogspot.sg/2013/08/three-approaches-to-measure-how.html

The higher the ChiSquare(Ci, Nj), the least likely that Support(Ci, Nj) is a random sample by chance. With this, if we combine the Support(Ci, Nj) and ChiSquare(Ci, Nj), we can find the best rules by selecting association rules with both high Support(Ci, Nj) and ChiSquare(Ci, Nj).

Criteria 3: Lift
Another criteria is lift, which can be measure by:

Lift(Ci, Nj) = [ActualCount(Ci, Nj) / ActualCount(Ci)] / [ActualCount(Nj) / TotalCount]

Lift tells us how much better the rule does rather than guessing, the higher Life(Ci, Nj), the better the rule (Ci, Nj)

Criteria 4: Confidence
Confidence is measured by:

Confidence(Ci, Nj) = ActualCount(Ci, Nj) / ActualCount(Ci)

It measure how often the rule (Ci, Nj) is true, given the Ci is true, the higher Confidence(Ci, Nj), the better the rule

Three approaches to measure how different are the different samples

Approach 1: Measure how far a random sample's average value is from the average of another sample.

Given the following two samples:

Distribution(sample 1) is ~N(mu1, sigma1)
Distribution(sample 2) is ~N(mu2, sigma2)

In above, sample 1 may be a random sample drawn from sample 2.Now suppose we want to know how far the sample 1's average value mu1 is from sample sample 2's average. The difference can be measured by the following equation.

<difference>=2 * MIN( 1- NORMSDIST(<z-score>), NORMSDIST(<z-score>))

where <z-score> = (mu1 - mu2) / sigma2. NORMSDIST(x) measure the area below the normal distribution curve from negative infinity to x. MIN(x, y) returns the minimum of (x, y).

Approach 2: Chi-Square Test

The chi-square test provides another method for addressing "how different is different?". The chi-square test is appropriate when there are multiple dimensions being compared to each other. In other words, two samples are compared across different dimensions. The chi-square test does not create confidence intervals, because confidence intervals do not make as much sense across multiple dimensions.

Suppose we have two samples by dividing an overral sample: sample1 and sample2. Sample1 has N1 items, and sample2 has N2 items. Each sample can divided into sub-groups: C1, C2, C3 using a single dimension (or feature), we have

C1(sample1)+C2(sample1)+C3(sample1)=N1
C1(sample2)+C2(sample2)+C3(sample2)=N2

C1(sample1)+C1(sample2)=C1
C2(sample1)+C2(sample2)=C2
C3(sample1)+C3(sample2)=C3

From above, we will have C1+C2+C3 = N1+N2

Now we calculate the expected value of C1(sample1), ... C3(sample2) as follows:

E[C1(sample1)]=(C1 * N1) / (N1+N2)
E[C2(sample1)]=(C2 * N1) / (N1+N2)
E[C3(sample1)]=(C3 * N1) / (N1+N2)
E[C1(sample2)]=(C1 * N2) / (N1+N2)
E[C2(sample2)]=(C2 * N2) / (N1+N2)
E[C3(sample2)]=(C3 * N2) / (N1+N2)

Next we calculate the deviation of the actual value (e.g. C1(sample1)) from the expected value (e.g. E[C1(sample1))], as follows:

D[C1(sample1)]=C1(sample1) - E[C1(sample1)]
....
D[C3(sample2)]=C3(sample2) - E[C3(sample2)]

Next we calculate the chi-square values, as follows:

ChiSquare[C1(sample1)]=D[C1(sample1)]^2 / E[C1(sample1)]

With the chi-squares value calculated, we can calculate the chi-square for each sample, as follows:

ChiSquare(sample1)=ChiSquare[C1(sample1)]+ChiSquare[C2(sample1)]+ChiSquare[C3(sample1)]
ChiSquare(sample2)=ChiSquare[C1(sample2)]+ChiSquare[C2(sample2)]+ChiSquare[C3(sample2)]

Note that the above calculation can be extended to arbitrary number of samples, each sample may be a sub-category (or sub-set) of an overall sample. To generalize this, given the overall sample is measured into two different dimensions or categories, with the first dimension measured in three different values (C1, C2, C3) and second dimension measured in two different values (N1, N2), then the chi-square for each cell (e.g. the cell (C1, N1)) can be calculated as

ChiSquare(C1, N1)=D(C1, N1)^2 / E(C1, N1)

Where
D(C1, N1)=ActualCount(C1, N1) - E(C1, N1)
E(C1, N1)=(ActualCount(C1)*ActualCount(N1)) / TotalCount)

TotalCount=(ActualCount(N1)+ActualCount(N2)=ActualCount(C1)+ActualCount(C2)+ActualCount(C3)
ActualCount(C1)=ActualCount(C1, N1)+ActualCount(C1, N2)
ActualCount(N1)=ActualCount(C1, N1)+ActualCount(C2, N1)+ActualCount(C3, N1)

The higher the value of ChiSquare(C1, N1), the further the actual count of combination (C1, N1) deviates from the expected count of (C1, N1). In other words, the higher the ChiSquare(C1, N1), the more unexpected the combination of (C1, N1) is. Or we can say, the higher the ChiSquare(C1, N1), the least likely that ActualCount(C1, N1) occurs by chance from the overall sample distribution (i.e. its distribution is more significantly different from the overall sample).

The chi-square values follows the distribution known as chi-square distribution. The degree of freedom for the chi-square distribution, k, is measured by the product of one-minus-count of each dimension, for example, the dimension 1 (C1, C2, C3) has 3 category, dimension 2 (N1, N2) has 2 category, the k value is therefore:

k=(3-1) * (2-1)=3

Approach 3: Student-t Test
This is described in another post.

Relationship between confidence level, confidence interval

Suppose a particular phenomenon follows a normal distribution, ~(mu, sigma), two points on the horizontal axis of the normal distribution curve define the ends of the confidence interval, and the area under the curve, between the two points, measures the confidence.

For normal distribution ~(mu, sigma), about 95.5% fall within two standard deviation (sigma * 2). By tradition, statistical significance is often taken at the 95% level, and this occurs at the (1.96 * sigma) deviations from the average.

The distance from a value to the average, measured in standard deviation, is called the z-score.

z-score=(value - mu) / sigma

using z-score, the confidence level can be computed by:

<confidence level>=NORMSDIST(<z-score>) - NORMSDIST(-<z-score>)

where NORMSDIST(<z-score>) measures the area from negative inifiity to <z-score> under the normal distribution curve.

The 95% confidence level occurs at confidence interval (z-score=-1.96, z-score=1.96). The confidence of 95% implies that the confidence interval ranges 1.96 times the standard deviaion on either side of the average. In other words, the confidence interval for the phenomenon (e.g. a poll) with 95% confidence ranges from (mu - 1.96 * sigma) to (mu + 1.96 * sigma)

Saturday, August 10, 2013

Convert query from MS SQL Server to MS SQL Server Compact Edition 3.5

Issue 1: Data Type
In MS SQL Server, when creating a datatable, user can specify data type [char], [smalldatetime] and [varchar] as shown in the example below:

CREATE TABLE [product](
 [PRODUCTID] [int] NULL,
 [PRODUCTNAME] [varchar](50) NULL,
 [PRODUCTGROUPCODE] [varchar](50) NULL,
 [PRODUCTGROUPNAME] [varchar](50) NULL,
 [INSTOCKFLAG] [char](50) NULL,       
 [FULLPRICE] [money] NULL
)

However, when port the above query to MS SQL Server Compact, the data types [char] and [varchar] must be changed to [nchar] and [nvarchar] (i.e. unicode-based [char] and [varchar], and the [smalldatetime] must be changed to [datetime] as shown in the example below:

CREATE TABLE [product](
 [PRODUCTID] [int] NULL,
 [PRODUCTNAME] [nvarchar](50) NULL,
 [PRODUCTGROUPCODE] [nvarchar](50) NULL,
 [PRODUCTGROUPNAME] [nvarchar](50) NULL,
 [INSTOCKFLAG] [nchar](50) NULL,
 [FULLPRICE] [money] NULL
)

Issue 2:

Thursday, August 8, 2013

C#: Replace in Files

Below shows the method that replace a text in files within a directory which is iteratively searched:

public void ReplaceInFiles(string dir_path, Dictionary<string, Dictionary<int, string>> results, string old_text, string new_text, string extension=null)
{
 DirectoryInfo dir = new DirectoryInfo(dir_path);
 DirectoryInfo[] sub_dirs = dir.GetDirectories();
 FileInfo[] files = dir.GetFiles();

 foreach (FileInfo file in files)
 {
  string file_path=file.FullName;
  if (!string.IsNullOrEmpty(extension) && file.Extension != extension) continue;

  Dictionary<int, string> found_locations = null;
  using (StreamReader reader = new StreamReader(file.OpenRead()))
  {
   string line;
   int line_index=0;
   while ((line = reader.ReadLine()) != null)
   {
    line_index++;
    if (line.Contains(old_text))
    {
     if (results.ContainsKey(file_path))
     {
      found_locations = results[file_path];
     }
     else
     {
      found_locations = new Dictionary<int, string>();
      results[file_path] = found_locations;
     }
     found_locations[line_index] = line;
    }
   }
  }

  if(found_locations!=null)
  {
   string file_content=null;
   using(StreamReader reader =new StreamReader(file.OpenRead()))
   {
    file_content=reader.ReadToEnd();
   }
   file_content=file_content.Replace(old_text, new_text).TrimEnd(Environment.NewLine.ToCharArray());
   using(StreamWriter writer=new StreamWriter(file.OpenWrite()))
   {
    writer.WriteLine(file_content);
    writer.Flush();
   }
  }
  
 }

 foreach (DirectoryInfo sub_dir in sub_dirs)
 {
  string sub_dir_path=sub_dir.FullName;
  ReplaceInFiles(sub_dir_path, results, old_text, new_text, extension);
 }
}

Below is the code snippet which shows how to use the method above:

string dir_path="C:\temp";
string text_to_find="SimuKit.DomainModels\";
string text_to_replace="packages\SimuKit.DomainModels\";
string extension=".sln";
Dictionary<string, Dictionary<int, string>> results=new Dictionary<string, Dictionary<int, string>>();

ReplaceInFiles(dir_path, results, text_to_find, text_to_replace, extension);
foreach (string file_path in results.Keys)
{
 Dictionary<int, string> line_content = results[file_path];
 
 foreach (int line_index in line_content.Keys)
 {
  Console.writeline("{0} is replaced in file {1} in line {2}, original content: {3}", text_to_find, file_path, line_index, line_content[line_index]);
 }
}

C#: Find in Files

The following method returns the list of files in a directory and the line location within the files that contains a particular text string. it iteratively search the sub directories of the directory.

public void FindInFiles(string dir_path, Dictionary<string, Dictionary<int, string>> results, string text_to_find, string extension=null)
{
 DirectoryInfo dir = new DirectoryInfo(dir_path);
 DirectoryInfo[] sub_dirs = dir.GetDirectories();
 FileInfo[] files = dir.GetFiles();

 foreach (FileInfo file in files)
 {
  string file_path=file.FullName;
  if (!string.IsNullOrEmpty(extension) && file.Extension != extension) continue;

  using (StreamReader reader = new StreamReader(file.OpenRead()))
  {
   string line;
   int line_index=0;
   while ((line = reader.ReadLine()) != null)
   {
    line_index++;
    if (line.Contains(text_to_find))
    {
     Dictionary<int, string> found_locations = null;
     if (results.ContainsKey(file_path))
     {
      found_locations = results[file_path];
     }
     else
     {
      found_locations = new Dictionary<int, string>();
      results[file_path] = found_locations;
     }
     found_locations[line_index] = line;
    }
   }
  }
 }

 foreach (DirectoryInfo sub_dir in sub_dirs)
 {
  string sub_dir_path=sub_dir.FullName;
  FindInFiles(sub_dir_path, results, text_to_find, extension);
 }
}

The following code snippet shows how to use this method:

string dir_path="C:\temp";
string text_to_find="SimuKit.DomainModels\";
string extension=".sln";
Dictionary<string, Dictionary<int, string>> results=new Dictionary<string, Dictionary<int, string>>();

FindInFiles(dir_path, results, text_to_find, extension);
foreach (string file_path in results.Keys)
{
 Dictionary<int, string> line_content = results[file_path];
 
 foreach (int line_index in line_content.Keys)
 {
  Console.writeline("{0} appears in file {1} in line {2}, content: {3}", text_to_find, file_path, line_index, line_content[line_index]);
 }
}

Monday, August 5, 2013

Enumerate files in a directory using PHP

<?php

if ($handle = opendir('./files')) 
{
    while (false !== ($entry = readdir($handle))) 
 {
  echo $entry;
    }

    closedir($handle);
}
?>

Data Encryption in PHP

The encryption.inc includes the encryption utility which prevents the user from deciphering the token. The source codes of the encryption.inc is shown below:

<?php

class Encryption {
    var $skey = '[YOUR-SECRET-KEY]';

    public  function safe_b64encode($string) {
        $data = base64_encode($string);
        $data = str_replace(array('+','/','='),array('-','_',''),$data);
        return $data;
    }

    public function safe_b64decode($string) {
        $data = str_replace(array('-','_'),array('+','/'),$string);
        $mod4 = strlen($data) % 4;
        if ($mod4) {
            $data .= substr('====', $mod4);
        }
        return base64_decode($data);
    }

    public  function encode($value){ 
        if(!$value){return false;}
        $text = $value;
        $iv_size = mcrypt_get_iv_size(MCRYPT_RIJNDAEL_256, MCRYPT_MODE_ECB);
        $iv = mcrypt_create_iv($iv_size, MCRYPT_RAND);
        $crypttext = mcrypt_encrypt(MCRYPT_RIJNDAEL_256, $this->skey, $text, MCRYPT_MODE_ECB, $iv);
        return trim($this->safe_b64encode($crypttext)); 
    }

    public function decode($value){
        if(!$value){return false;}
        $crypttext = $this->safe_b64decode($value); 
        $iv_size = mcrypt_get_iv_size(MCRYPT_RIJNDAEL_256, MCRYPT_MODE_ECB);
        $iv = mcrypt_create_iv($iv_size, MCRYPT_RAND);
        $decrypttext = mcrypt_decrypt(MCRYPT_RIJNDAEL_256, $this->skey, $crypttext, MCRYPT_MODE_ECB, $iv);
        return trim($decrypttext);
    }
}

?>

To use the above facility, the following is the demo

$converter = new Encryption;
$raw='Hello World';
$encoded = $converter->encode($raw);
$decoded = $converter->decode($encoded);

Send HTML Email in PHP

ini_set("SMTP","[YOUR-EMAIL-SERVER-ADDRESS]");  // this line 184 for me  

$to      = '[YOUR-TO-EMAIL-ACCOUNT]';
$subject = 'This is the subject line';
$message = 'Hello World!

';
$message .='';
 
$headers = "MIME-Version: 1.0" . "\r\n";
$headers .= "Content-type:text/html;charset=iso-8859-1" . "\r\n";

$headers .= 'From: <from@example.com>' . "\r\n";
$headers .= 'Cc: cc@example.com' . "\r\n";

mail($to, $subject, $message, $headers);

A simple tutorial on creating software registration and download page with encrypted one day valid token using PHP

This tutorial shows how to create a registration page that will automatically send user an automatically generated software download link using encrypted one day valid token after user register. For simplicity, the database logging is omitted.

Step 1: Create a registration form page
Create a simple registration form page registration.htm as in the follows:

<html>
<body>
Software Application Download Registration <br>

<form id="regForm" action="submit.php" method="post">

<table>
  <tbody>
  <tr>
    <td><label for="fname">First Name:</label></td>
    <td><div class="input-container"><input name="fname" id="fname" type="text" /></div></td>
  </tr>
  <tr>
    <td><label for="lname">Last Name:</label></td>
    <td><div class="input-container"><input name="lname" id="lname" type="text" /></div></td>
  </tr>
  <tr>
    <td><label for="email">Your Email:</label></td>
    <td><div class="input-container"><input name="email" id="email" type="text" /></div></td>
  </tr>
  <tr>
    <td><label for="company">Company:</label></td>
    <td><div class="input-container"><input name="company" id="company" type="text" /></div></td>
  </tr>
  <tr>
  <td> </td>
  <td><input type="submit" class="greenButton" value="Register and Download" /><img id="loading" src="img/ajax-loader.gif" alt="working.." />
</td>
  </tr>
  
  
  </tbody>
</table>

</form>
</body>
</html>

Step 2: Create registration submission processing server script in PHP
Create the submit.php page which processes the registration information from the registration.htm page. the submit.php page sends the registered user an email with a link to download the software, the link is valid for the current date (using the concept of token), the submit.php also cc the email to [YOUR-EMAIL-CC-ACCOUNT] (As this is a simple demo, the validation of registration information is not included here)

<?php

require_once('encryption.inc');

$token="dfjsl@ew".date('Ymd')."oqiej".date('Y')."!#@#@1".date('m')."2331";

$converter = new Encryption;
$encoded = $converter->encode($token);

$mail_server="[YOUR-MAIL-SERVER-IP-ADDRESS]";

ini_set("SMTP",$mail_server);  

$to      = $_POST['email'];
$cc     = '[YOUR-EMAIL-CC-ACCOUNT]';

$subject = 'XXX Application Download';
$message = '<html><body><h2>User registration information: </h2><br />';
$message .= '<table>';
$message .='<tr><td><b>First Name:</b></td><td>'.$_POST['fname'].'</td></tr>';
$message .='<tr><td><b>Last Name: </b></td><td>'.$_POST['lname'].'</td></tr>';
$message .='<tr><td><b>Email: </b></td><td>'.$_POST['email'].'</td></tr>';
$message .='<tr><td><b>Company: </b></td><td>'.$_POST['company'].'</td></tr>';
$message .='</table>';

$elink = '[YOUR-WEBSITE-DOMAIN-AND-DIRECTORY]/download.php?token='.urlencode($encoded).'&attachment=vrpmac';
$message .= 'Thank you for your interest. Below is a one-day expired download link to the software.<br/>';
$message .= '<a href="'.$elink.'">'.'Download'.'</a><br>';
 
$message .='</body></html>';
 
$headers = "MIME-Version: 1.0" . "\r\n";
$headers .= "Content-type:text/html;charset=iso-8859-1" . "\r\n";

$headers .= 'From: <[YOUR-EMAIL-FROM-ACCOUNT]>' . "\r\n";
$headers .= 'Bcc: ' . $cc . "\r\n";

mail($to, $subject, $message, $headers);

echo msg(1,"registered.php?token=".urlencode($encoded).'&email='.urlencode($_POST['email']));


function msg($status,$txt)
{
 return '{"status":'.$status.',"txt":"'.$txt.'"}';
}
?>

The $token variable contains a token which makes the software download page generated valid for one day, to increase the security you may want to create more sophisticated token using encryption and random numbers.

The [YOUR-MAIL-SERVER-IP-ADDRESS] refers to your mail server address. The [YOUR-EMAIL-CC-ACCOUNT] refers to a cc account when the registration email is sent to the registered user. The [YOUR-WEBSITE-DOMAIN-AND-DIRECTORY] is the root directory which contains the download.php page (which is the page that provides the software download service), The [YOUR-EMAIL-FROM-ACCOUNT] is the from field in the email sent to the registered user.

The encryption.inc includes the encryption utility which prevents the user from deciphering the token. The source codes of the encryption.inc is shown below:

<?php

class Encryption {
    var $skey = '[YOUR-SECRET-KEY]';

    public  function safe_b64encode($string) {
        $data = base64_encode($string);
        $data = str_replace(array('+','/','='),array('-','_',''),$data);
        return $data;
    }

    public function safe_b64decode($string) {
        $data = str_replace(array('-','_'),array('+','/'),$string);
        $mod4 = strlen($data) % 4;
        if ($mod4) {
            $data .= substr('====', $mod4);
        }
        return base64_decode($data);
    }

    public  function encode($value){ 
        if(!$value){return false;}
        $text = $value;
        $iv_size = mcrypt_get_iv_size(MCRYPT_RIJNDAEL_256, MCRYPT_MODE_ECB);
        $iv = mcrypt_create_iv($iv_size, MCRYPT_RAND);
        $crypttext = mcrypt_encrypt(MCRYPT_RIJNDAEL_256, $this->skey, $text, MCRYPT_MODE_ECB, $iv);
        return trim($this->safe_b64encode($crypttext)); 
    }

    public function decode($value){
        if(!$value){return false;}
        $crypttext = $this->safe_b64decode($value); 
        $iv_size = mcrypt_get_iv_size(MCRYPT_RIJNDAEL_256, MCRYPT_MODE_ECB);
        $iv = mcrypt_create_iv($iv_size, MCRYPT_RAND);
        $decrypttext = mcrypt_decrypt(MCRYPT_RIJNDAEL_256, $this->skey, $crypttext, MCRYPT_MODE_ECB, $iv);
        return trim($decrypttext);
    }
}

?>

The registered.php is a page displayed to the user after user submit the registration form. The page informs user and he has successfully registered (You can include the download link here if you like). The page checks whether the token is included in the url, if not or if the token is not valid, then the user is redirected to the registration.htm page. The source code of the registered.php is as below:

<?php
 require_once('encryption.inc');
 
 if(!isset($_GET['token']))
 {
  header( 'Location: registration.htm' ) ;
 }
 $message="dfjsl@ew".date('Ymd')."oqiej".date('Y')."!#@#@1".date('m')."2331";
 $converter = new Encryption;
 $encode=$_GET['token'];
 $decoded = $converter->decode($encode);  
 if(strcmp($decoded, $message) != 0)
 {
  header( 'Location: registration.htm' ) ;
 }
?>


<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>You've been registered!</title>
<link rel="stylesheet" type="text/css" href="reg.css">
</head>

<body>

<div class="registered">

<h1>The application link will be sent to your email account <?php echo $_GET['email']; ?></h1>
<br /><br /><br />

<br /><br /><br /><br />
<h2>Thank you for your registration. </h2>
</div>

</body>
</html>

Step 3: Create software download page
Below is the source code for the software download page, the download script first checks whether the token is valid or not, if not then redirect user to the registration page. Otherwise, it sends the software.

<?php


require_once('encryption.inc');
 
if(!isset($_GET['token']))
{
 header( 'Location: registration.htm' ) ;
}
$message="dfjsl@ew".date('Ymd')."oqiej".date('Y')."!#@#@1".date('m')."2331";
$converter = new Encryption;
$encode=$_GET['token'];
//$encoded = $converter->encode($message);
$decoded = $converter->decode($encode);  
if(strcmp($decoded, $message) != 0)
{
 header( 'Location: registration.htm' ) ;
}

if(isset($_GET['attachment']))
{
 if($_GET['attachment']=='vrpmac')
 { 
  $file = 'uyk0293fwie0.zip';
  if (file_exists($file)) 
  {
   header('Content-Description: File Transfer');
   header('Content-Type: application/octet-stream');
   header('Content-Disposition: attachment; filename=painter2.zip');
   header('Content-Transfer-Encoding: binary');
   header('Expires: 0');
   header('Cache-Control: must-revalidate, post-check=0, pre-check=0');
   header('Pragma: public');
   header('Content-Length: ' . filesize($file));
   ob_clean();
   flush();
   readfile($file);
  }
 }
}

?>

Load local html file into the winform browser

string url = @"C:\temp.html";
url = url.Replace(@"\", @"/");
url = url.Replace(@" ", @"%20");
url = "file:///" + url;

webBrowser1.Navigate(url);

Sunday, August 4, 2013

Incorporate Google Translate into your C# program

Step 1: Download and add the Google .Net Client to your C# solution.

Download the google-api-dotnet-client-1.4.0-beta.samples from

https://code.google.com/p/google-api-dotnet-client/wiki/Downloads

Unzip the package and add the "GoogleApis.SampleHelper" and "Translate.TranslateText" projects to your C# solution,

Step 2: Configure your C# project for Google Translate.

Add the following libraries from the "google-api-dotnet-client-1.4.0-beta.samples/Lib" or "google-api-dotnet-client-1.4.0-beta.samples/Lib/ThirdParty" folder to your C# project:

DotNetOpenAuth
Google.Apis
Google.Apis.Authentication.OAuth2
Newtonsoft.Json
System.Net.Http
System.Net.Http.Exteions
System.Net.Http.Primitives
System.Net.Http.WebRequest
Zlib.Portable
System.Threading.Tasks

Add "System.Net" and "GoogleApis.SampleHelper" references to your C# project as well.

Next copy the ""Google.Apis.Translate.v2.cs" from the "Translate.TranslateText" project to your C# project.

Next right-click your C# project and selects "Manage NuGet Packages", in the "Manage NuGet Packages" dialog, search and install the following packages:

Microsoft BCL Build Components
Microsoft BCL Portability Package
Microsoft HTTP Client Libraries

Step 3: Activate your google translate services.
You need to activate your google translate service in order to use google translate in your C# program. go to the Google API Console at:

code.google.com/apis/console‎

Select the "Services" menu from the right panel, and turn on the Google translate service from there. now copy your google api access key by selecting the "API Access" from the right panel, and copying the key in the line starting with "API Key:"

Step 4: Implement the coding
Below i provide a simple singleton class for you to use the translation service.

using Google.Apis.Samples.Helper;
using Google.Apis.Services;
using Google.Apis.Translate.v2;
using Google.Apis.Translate.v2.Data;
using TranslationsResource = Google.Apis.Translate.v2.Data.TranslationsResource;

public sealed class TranslationManager
{
 private static TranslationManager mInstance = null;
 private static object mSyncObj = new object();

 public static TranslationManager Instance
 {
  get
  {
   if (mInstance == null)
   {
    lock (mSyncObj)
    {
     mInstance = new TranslationManager();
    }
   }
   return mInstance;
  }
 }

 private TranslationManager()
 {

 }

 private string GetApiKey()
 {
  return "[Your-API-Key]";
 }

 public KeyValuePair<string, string> Translate(string srcText, string target_language = "en")
 {
  string[] text = new string[1] { srcText };
  Dictionary<string, KeyValuePair<string, string>> translation=Translate(text, target_language);
  return translation[srcText];
 }

 public Dictionary<string, KeyValuePair<string, string>> Translate(string[] srcText, string target_language = "en")
 {

  // Create the service.
  var service = new TranslateService(new BaseClientService.Initializer()
  {
   ApiKey = GetApiKey(),
   ApplicationName = "Translate API Your App"
  });

  TranslationsListResponse response = service.Translations.List(srcText, target_language).Execute();
  Dictionary<string, KeyValuePair<string, string>> translations = new Dictionary<string, KeyValuePair<string, string>>();

  int counter = 0;
  foreach (TranslationsResource translation in response.Translations)
  {
   translations[srcText[counter]] = new KeyValuePair<string, string>(translation.TranslatedText, translation.DetectedSourceLanguage);
   counter++;
  }

  return translations;
 }
}

Below is a simple example of how to use the TranslationManager class:

string sourceText = "早上好";
string target_language = "en";

KeyValuePair<string, string> translation=TranslationManager.Instance.Translate(sourceText, target_language);
Console.WriteLine("Translated: {0}", translation.Key);
Console.WriteLine("Source Language: {0}", translation.Value);

Programming Breaks