 Machine learning SQL injection detection
#1
Good day. I am a postgraduate student working on "Detecting and preventing SQL injection attacks on a database using a machine learning approach". My major challenges right now are generating the dataset and writing the appropriate code in Python. I would be very grateful for any help you can offer. Thanks a lot.
#2
I moved the thread to the News and Discussions sub-forum, because it looks more appropriate for a general discussion of possible approaches. My understanding is that you don't have code or specific questions yet.
#3
First off, I hope you are using the latest version of Python (3.6.3). You might use Python's built-in sqlite3 module to create your test database. Once it is created, make a backup copy so you always have one pristine copy and one you can attack. I've seen people working with databases of thousands of entries, when really all you need is a minimal amount. In your case, 2-5 entries would probably be enough to initially test the actual code for injection/detection. Once satisfied, you can always increase the size of the database or even try it against other databases.
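
As a rough illustration of the setup described above, here is a minimal sketch using only the standard library. The table name `users`, its columns, the three sample rows, and the helper names `build_test_db`, `unsafe_lookup` and `safe_lookup` are all made up for this example. It also assumes Python 3.7 or newer, because `Connection.backup` does not exist in earlier versions.

```python
import sqlite3


def build_test_db(conn):
    """Create a tiny, hypothetical users table with three rows."""
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, password TEXT)"
    )
    conn.executemany(
        "INSERT INTO users (name, password) VALUES (?, ?)",
        [("alice", "a1"), ("bob", "b2"), ("carol", "c3")],
    )
    conn.commit()


def unsafe_lookup(conn, name):
    """Deliberately vulnerable: user input is concatenated into the SQL text."""
    query = "SELECT name FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def safe_lookup(conn, name):
    """Parameterized query: the input is bound as data, never parsed as SQL."""
    return conn.execute(
        "SELECT name FROM users WHERE name = ?", (name,)
    ).fetchall()


# Pristine copy: built once and never attacked.
pristine = sqlite3.connect(":memory:")
build_test_db(pristine)

# Attack copy: cloned from the pristine database (Python 3.7+).
target = sqlite3.connect(":memory:")
pristine.backup(target)

payload = "' OR '1'='1"  # classic tautology-style injection string
leaked = unsafe_lookup(target, payload)
safe = safe_lookup(target, payload)

print("rows returned by the vulnerable query:", len(leaked))
print("rows returned by the parameterized query:", len(safe))
```

Both databases live in memory here to keep the sketch self-contained. For real experiments you would probably pass file names to `sqlite3.connect` instead, so the pristine copy survives between runs.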

If you run into problems, either with the database or the program, we are here to help. Be sure to read the section of our Help document on BBCode before you post your code, errors and output.
#4
I have code for malicious URL detection, but I want it rewritten for SQL injection detection. Please, can anyone help? The code is below. Thanks.


import pandas as pd

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# data.csv must contain a "url" column and a "label" column
urls_data = pd.read_csv("data.csv")
print(urls_data.head())


def makeTokens(f):
    """Split a URL on '/', then '-', then '.', and return the unique tokens."""
    # Use str(f) here: in Python 3, str(f.encode('utf-8')) produces "b'...'",
    # which leaks the b' prefix and the closing quote into the tokens.
    tkns_BySlash = str(f).split('/')
    total_Tokens = []
    for i in tkns_BySlash:
        tokens = str(i).split('-')
        tkns_ByDot = []
        for j in tokens:
            tkns_ByDot = tkns_ByDot + j.split('.')
        total_Tokens = total_Tokens + tokens + tkns_ByDot
    total_Tokens = list(set(total_Tokens))  # drop duplicate tokens
    if 'com' in total_Tokens:
        total_Tokens.remove('com')  # 'com' is too common to be informative
    return total_Tokens


y = urls_data["label"]
url_list = urls_data["url"]

# Turn each URL into a TF-IDF vector using the custom tokenizer
vectorizer = TfidfVectorizer(tokenizer=makeTokens)
x = vectorizer.fit_transform(url_list)

# Hold out 20% of the rows for testing
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=42)

logit = LogisticRegression()
logit.fit(x_train, y_train)
print("Accuracy", logit.score(x_test, y_test))

# Classify a few unseen URLs with the fitted vectorizer and model
urls_to_check = ["http://www.psn.com.pk/",
                 "google.com/search=faizanahmad",
                 "www.radsport-voggel.de/wp-admin/includes/log.exe",
                 "www.radsport-voggel.de/wp-admin/includes/an/log.exe",
                 "www.google.com",
                 "www.google-scholar.com/wp-good"]
x_predict = vectorizer.transform(urls_to_check)
new_predict = logit.predict(x_predict)
print(new_predict)
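
One possible starting point for adapting this to SQL injection is to swap the URL tokenizer for one that understands query text, since splitting on '/', '-' and '.' throws away the quotes, operators and comment markers that injection strings rely on. The sketch below is only an assumption about what such a tokenizer could look like. `TOKEN_PATTERN` and `make_sql_tokens` are hypothetical names, and the regular expression is a simple illustration, not a full SQL lexer.

```python
import re

# Words/identifiers, integers, SQL comment markers, then any other single symbol.
# The order matters: "--" and "/*" must be tried before the single-symbol fallback.
TOKEN_PATTERN = re.compile(r"[A-Za-z_]\w*|\d+|--|/\*|\*/|[^\w\s]")


def make_sql_tokens(query):
    """Return the lowercased tokens of a query string, keeping punctuation."""
    return [tok.lower() for tok in TOKEN_PATTERN.findall(str(query))]


# Hypothetical inputs, for illustration only.
benign = "SELECT name FROM users WHERE id = 5"
suspicious = "admin' OR '1'='1' --"

print(make_sql_tokens(benign))
print(make_sql_tokens(suspicious))
```

With something like this in place, the rest of the script could stay largely the same: pass `tokenizer=make_sql_tokens` to `TfidfVectorizer`, and read a CSV whose text column holds queries or user input instead of URLs, with a label column marking each row as injection or benign. The column names and the labelled data itself are yours to define; none of that is supplied here.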
