Wednesday, August 8, 2018

Getting Started with Machine Learning Using R on a Windows Environment: Step-by-Step Process


This post explains how to install R on a Windows environment and how to work through a machine learning project using R with a simple dataset.

First, download the latest version of R from https://cran.r-project.org/bin/windows/base/.
1. Once the download completes, install it on your machine. This is like any other software installation; no special instructions are required.
2. After a successful installation, we need to set up the path: right-click My Computer > Environment Variables > System variables, and add the R bin directory to the Path variable, for example:
C:\Program Files\R\R-3.5.1\bin


3. After setting up the path, we can start R.
4. Go to the command prompt and type R.
5. Now we can see the simple R terminal.
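Once the R prompt appears, a quick sanity check confirms the installation (a minimal sketch; the exact version string depends on the release you installed):
# Print the installed R version and the current working directory
R.version.string
getwd()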


6. Now we will understand what machine learning is and what a dataset is.
7. When we apply machine learning to our own datasets, we are working on a project.
The process of a machine learning project may not be linear, but there are a number of well-known steps:

Define Problem.
Prepare Data.
Evaluate Algorithms.
Improve Results.
Present Results.

8. The best way to really come to terms with a new platform or tool is to work through a machine learning project end to end and cover the key steps: loading the data, summarizing the data, evaluating algorithms, and making some predictions.
Machine Learning Using R, Step by Step
Now it is time to work through a simple machine learning program using R and the built-in dataset called iris.

We have already installed and started R.
Install the required packages using the following syntax.


Packages are third party add-ons or libraries that we can use in R.

install.packages("caret")
# While installing the package, after typing the above command, R will ask us to select a mirror; you can select the default one.
install.packages("caret", dependencies=c("Depends", "Suggests"))
install.packages("ellipse")
# Load the package which we are going to use
library(caret)
Load the built-in data and rename it using the following syntax.
# Attach the iris dataset to the current environment
data(iris)
# Rename the iris dataset to dataset
dataset <- iris
The iris data is now loaded in R and accessible through a variable called dataset. Next we will create a validation dataset: we will split the loaded dataset into two parts, 80% of which we will use to train our models and 20% that we will hold back as a validation set.
# Create a list of 80% of the rows in the original dataset that we can use for training
validation_index <- createDataPartition(dataset$Species, p=0.80, list=FALSE)
# Select 20% of the data for validation
validation <- dataset[-validation_index,]
# Use the remaining 80% of the data for training and testing the models
dataset <- dataset[validation_index,]

          
Now we have training data in the dataset variable and a validation set that we will use later in the validation variable. Note that we replaced our dataset variable with the 80% sample of the dataset.

1. dim function - We can get a quick idea of how many instances (rows) and how many attributes (columns) the data contains with the dim function.
dim(dataset)
2. Attribute types - Knowing the types is important as it will give us an idea of how to better summarize the data we have and the types of transforms we might need to use to prepare the data before we model it.
sapply(dataset,class)
3. The head function is used to display the first six rows.
head(dataset)
4. The class variable is a factor. A factor is a data type that has multiple class labels, or levels:
levels(dataset$Species)
5. Class Distribution - Let's now take a look at the number of instances (rows) that belong to each class. We can view this as an absolute count and as a percentage (see the sketch just after the summary below).
6. Summary of each Attribute
summary(dataset)
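For the class distribution in step 5, here is a minimal sketch (assuming the dataset variable created above) that shows the absolute count and the percentage for each species:
# Summarize the class distribution as counts and percentages
percentage <- prop.table(table(dataset$Species)) * 100
cbind(freq=table(dataset$Species), percentage=percentage)
Because the iris data is balanced, each of the three species accounts for roughly one third of the rows.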
Visualize the Dataset

We have now seen the basic details about the data. We need to extend that with some visualizations. We are going to look at two types of plots:
1. Univariate plots to better understand each attribute.
2. Multivariate plots to better understand the relationships between attributes.
First we will look at the univariate plots, that is, plots of each individual variable: the input attributes x and the output attribute y.
# Split the input and output attributes
x <- dataset[,1:4]
y <- dataset[,5]
Given that the input variables are numeric, we can create box and whisker plots of each.
par(mfrow=c(1,4))
for(i in 1:4) {
  boxplot(x[,i], main=names(iris)[i])
}
 
We can also create a barplot of the Species class variable to get a graphical representation of the class distribution (generally uninteresting in this case because they’re even).
plot(y)
This confirms what we learned in the last section: the instances are evenly distributed across the three classes.

Multivariate Plots

First let's look at scatterplots of all pairs of attributes and color the points by class. In addition, because the scatterplots show that the points for each class are generally separate, we can draw ellipses around them.
featurePlot(x=x, y=y, plot="ellipse")
We can also look at box and whisker plots of each input variable again, but this time broken down into separate plots for each class. This can help to tease out obvious linear separations between the classes.
featurePlot(x=x, y=y, plot="box")
Next we can get an idea of the distribution of each attribute, again like the box and whisker plots, broken down by class value. Sometimes histograms are good for this, but in this case we will use some probability density plots to give nice smooth lines for each distribution.
# Density plots for each attribute by class value
scales <- list(x=list(relation="free"), y=list(relation="free"))
featurePlot(x=x, y=y, plot="density", scales=scales)
Evaluating the Algorithms

Set up the test harness to use 10-fold cross-validation: we will split our dataset into 10 parts, train on 9 and test on 1, and repeat for all combinations of train-test splits. (The code below uses plain 10-fold cross-validation; to repeat the process, say 3 times with different splits of the data, caret's method="repeatedcv" with repeats=3 could be used instead.)

We are using the metric of "Accuracy" to evaluate models. This is the number of correctly predicted instances divided by the total number of instances in the dataset, multiplied by 100 to give a percentage (e.g. 95% accurate). We will use the metric variable when we build and evaluate each model next.
control <- trainControl(method="cv", number=10)
metric <- "Accuracy"
Build 5 different models to predict species from flower measurements:
1. Linear Discriminant Analysis (LDA)
2. Classification and Regression Trees (CART)
3. k-Nearest Neighbors (kNN)
4. Support Vector Machines (SVM) with a radial kernel
5. Random Forest (RF)
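# a) linear algorithms
# LDA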
set.seed(7)
fit.lda <- train(Species~., data=dataset, method="lda", metric=metric, trControl=control)
# b) nonlinear algorithms
# CART
set.seed(7)
fit.cart <- train(Species~., data=dataset, method="rpart", metric=metric, trControl=control)
# kNN
set.seed(7)
fit.knn <- train(Species~., data=dataset, method="knn", metric=metric, trControl=control)
# c) advanced algorithms
# SVM
set.seed(7)
fit.svm <- train(Species~., data=dataset, method="svmRadial", metric=metric, trControl=control)
# Random Forest
set.seed(7)
fit.rf <- train(Species~., data=dataset, method="rf", metric=metric, trControl=control)






We reset the random number seed before each run to ensure that the evaluation of each algorithm is performed using exactly the same data splits; this ensures the results are directly comparable.

Select the best model

We now have 5 models and accuracy estimations for each. We need to compare the models to each other and select the most accurate. We can report on the accuracy of each model by first creating a list of the created models and then using the summary function.
# summarize accuracy of models
results <- resamples(list(lda=fit.lda, cart=fit.cart, knn=fit.knn, svm=fit.svm, rf=fit.rf))
summary(results)

We can also create a plot of the model evaluation results and compare the spread and the mean accuracy of each model. There is a population of accuracy measures for each algorithm because each algorithm was evaluated 10 times (10-fold cross-validation).
dotplot(results)
The results can be summarized. This gives a nice summary of what was used to train the model and the mean and standard deviation (SD) accuracy achieved, specifically 97.5% accuracy +/- 4%.

Making Predictions Using predict and confusionMatrix

The LDA was the most accurate model. Now we want to get an idea of the accuracy of the model on our validation set. This will give us an independent final check on the accuracy of the best model. It is valuable to keep a validation set just in case you made a slip during training, such as overfitting to the training set or a data leak; both will result in an overly optimistic result. We can run the LDA model directly on the validation set and summarize the results in a confusion matrix.
predictions <- predict(fit.lda, validation)
confusionMatrix(predictions, validation$Species)
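If you want the overall accuracy as a single number rather than reading it off the printed output, the result of confusionMatrix can be stored and queried (a small usage sketch, assuming the predictions and validation objects created above):
# Store the confusion matrix and pull out the overall accuracy
cm <- confusionMatrix(predictions, validation$Species)
cm$overall["Accuracy"]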
   

Thursday, August 2, 2018

REST, REST Security, REST API Methods, REST annotations



REST - Representational State Transfer

1. REST is an architectural style.

2. REST implementations are commonly based on JSON over HTTP.

3. REST is implemented on top of the simple HTTP protocol.

4. REST offers better scalability and performance.

5. REST permits multiple data formats such as JSON, XML, etc.

6. REST emphasizes scalability of component interactions and independent deployment of components.

7. REST builds on the HTTP and URI standards.

8. REST uses HTTP methods like GET, POST, PUT, DELETE, and PATCH.

9. HTTP PATCH requests are used to make a partial update on a resource. PUT requests also modify a resource entity, so to make the distinction clear: PATCH is the correct choice for partially updating an existing resource, while PUT should only be used when replacing a resource in its entirety (see the sketch below).
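As an illustration, here is a hypothetical JAX-RS resource (a sketch only; it assumes JAX-RS 2.1 for the @PATCH annotation, and the /users path, method names and plain-text responses are made up for this example):

import javax.ws.rs.*;

@Path("/users/{id}")
public class UserResource {

    @PUT
    @Consumes("application/json")
    public String replaceUser(@PathParam("id") String id, String fullRepresentation) {
        // PUT: the request body carries the complete new representation of the user
        return "User replaced";
    }

    @PATCH
    @Consumes("application/json")
    public String patchUser(@PathParam("id") String id, String partialRepresentation) {
        // PATCH: the request body carries only the fields being changed
        return "User partially updated";
    }
}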

10. REST implementations commonly use JAX-RS and Jersey.

11. Annotations of JAX-RS

@Context

Injects information into a class field, bean property, or method parameter

@CookieParam

Extracts information from cookies declared in the cookie request header

@FormParam

Extracts information from a request representation whose content type is application/x-www-form-urlencoded

@HeaderParam

Extracts the value of a header

@MatrixParam

Extracts the value of a URI matrix parameter

@PathParam

Extracts the value of a URI template parameter

@QueryParam

Extracts the value of a URI query parameter
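
To see a few of these parameter annotations together, here is a small hypothetical example (the /orders path, parameter names and return value are assumptions made for illustration):

import javax.ws.rs.*;

@Path("/orders")
public class OrderResource {

    @GET
    @Path("/{orderId}")
    @Produces("text/plain")
    public String getOrder(@PathParam("orderId") String orderId,             // value of the {orderId} URI template parameter
                           @QueryParam("expand") String expand,              // value of the ?expand= query parameter
                           @HeaderParam("Accept-Language") String language) { // value of the Accept-Language request header
        return "Order " + orderId + " (expand=" + expand + ", language=" + language + ")";
    }
}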

12. HTTP Status codes

200 OK - Response to a successful REST API action. The HTTP method can be GET, POST, PUT, PATCH or DELETE.
400 Bad Request - The request is malformed, such as message body format error.
401 Unauthorized - Wrong or no authentication ID/password provided.
403 Forbidden - Used when authentication succeeded but the authenticated user does not have permission to access the requested resource.
404 Not Found - When a non-existent resource is requested.
405 Method Not Allowed - Returned when an unexpected HTTP method is used. For example, the REST API expects HTTP GET, but HTTP PUT is used.


13. REST security

Use the javax.ws.rs.core.SecurityContext interface to implement security programmatically, for example:


@GET
@Produces("text/plain;charset=UTF-8")
@Path("/hello")
public String updateUser(@Context SecurityContext sc) {
    if (sc.isUserInRole("admin")) return "User will be updated";
    throw new SecurityException("User is unauthorized.");
}

Applying annotations to your JAX-RS classes

@DeclareRoles

Declares security roles.

@DenyAll

Specifies that no security roles are allowed to invoke the specified methods.

@PermitAll

Specifies that all security roles are allowed to invoke the specified methods.

@RolesAllowed

Specifies the list of security roles that are allowed to invoke the methods in the application.

@RunAs

Defines the identity of the application during execution in a J2EE container.


@Path("/helloUser")
@RolesAllowed({"ADMIN", "DEV"})
public class helloUser {

   @GET
   @Path("updateUser")  
   @Produces("text/plain")
   @RolesAllows("ADMIN")
   public String updateUser() {
      return "User Updated!";
   }
}

Updating the web.xml deployment descriptor to define security configuration


         
<security-constraint>
    <web-resource-collection>
        <web-resource-name>Users</web-resource-name>
        <url-pattern>/user</url-pattern>
        <http-method>GET</http-method>
        <http-method>POST</http-method>
    </web-resource-collection>
    <auth-constraint>
        <role-name>admin</role-name>
    </auth-constraint>
</security-constraint>

<login-config>
    <auth-method>BASIC</auth-method>
    <realm-name>default</realm-name>
</login-config>

<security-role>
    <role-name>admin</role-name>
</security-role>
    
 
 

Thanks for viewing this post. If you like it, don't forget to leave a comment.

