Python Forum
Machine learning SQL injection detection
#1
Good day. I am a postgraduate student working on "Detecting and preventing SQL injection attacks on a database using a machine learning approach". My major challenge right now is generating the dataset and writing the appropriate code in Python. I would be very grateful for any help you can offer. Thanks a lot.
#2
I moved the thread to the News and Discussions sub-forum because it looks more appropriate for a general discussion of possible approaches. My understanding is that you don't have code or specific questions yet.
#3
First off, I hope you are using the latest version of Python (3.6.3). You might use Python's built-in sqlite3 module to create your test database. Once it is created, make a backup copy so you always have a pristine copy and one you can attack. I've seen people working with databases of thousands of entries when all you really need is a minimal amount. In your case, 2-5 entries would probably be enough to initially test the actual injection/detection code. Once satisfied, you can always increase the size of the database or even try it against other databases.
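
As a rough illustration of that setup, here is a minimal sketch using only the standard library. The table layout, file names, sample rows, and the two login helpers are all made up for the example; they are assumptions, not anything from the original poster's project. It builds a three-row database, copies it so one file stays pristine, and runs the same payload through a deliberately vulnerable query and a parameterized one.

```python
import os
import shutil
import sqlite3
import tempfile


def build_test_db(path):
    """Create a tiny, hypothetical users table with three rows."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, password TEXT)"
    )
    conn.executemany(
        "INSERT INTO users (name, password) VALUES (?, ?)",
        [("alice", "wonderland"), ("bob", "builder"), ("carol", "singer")],
    )
    conn.commit()
    conn.close()


def login_unsafe(conn, name, password):
    # Deliberately vulnerable: user input is pasted straight into the SQL text.
    query = "SELECT id FROM users WHERE name = '%s' AND password = '%s'" % (
        name,
        password,
    )
    return conn.execute(query).fetchall()


def login_safe(conn, name, password):
    # Parameterized: the driver treats the input as data, never as SQL.
    query = "SELECT id FROM users WHERE name = ? AND password = ?"
    return conn.execute(query, (name, password)).fetchall()


workdir = tempfile.mkdtemp()
pristine = os.path.join(workdir, "pristine.db")  # never touched after creation
target = os.path.join(workdir, "target.db")      # the copy you attack
build_test_db(pristine)
shutil.copyfile(pristine, target)

conn = sqlite3.connect(target)
payload = "' OR '1'='1"
unsafe_rows = login_unsafe(conn, "alice", payload)
safe_rows = login_safe(conn, "alice", payload)
conn.close()

print("rows returned by unsafe query:", len(unsafe_rows))
print("rows returned by safe query:", len(safe_rows))
```

Inputs like the payload above, paired with ordinary inputs, are one possible starting point for a labelled dataset; restoring the target file from the pristine copy resets the experiment.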

If you run into problems, either with the database or the program, we are here to help. Be sure to read the section of our Help document on BBCode before you post your code, errors, and output.
If it ain't broke, I just haven't gotten to it yet.
OS: Windows 10, openSuse 42.3, freeBSD 11, Raspian "Stretch"
Python 3.6.5, IDE: PyCharm 2018 Community Edition
#4
I have code for malicious URL detection, but I want it rewritten for SQL injection detection. Please, can anyone help? The code is below. Thanks.

import pandas as pd

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

urls_data = pd.read_csv("data.csv")  # must contain "url" and "label" columns
print(urls_data.head())  # quick look at the first rows
def makeTokens(f):
    # Split the URL on '/', then '-', then '.', and return the unique tokens.
    # str(f) replaces str(f.encode('utf-8')): in Python 3 the latter yields
    # "b'...'" and pollutes the first and last tokens.
    tkns_BySlash = str(f).split('/')
    total_Tokens = []
    for i in tkns_BySlash:
        tokens = i.split('-')
        tkns_ByDot = []
        for token in tokens:
            tkns_ByDot = tkns_ByDot + token.split('.')
        total_Tokens = total_Tokens + tokens + tkns_ByDot
    total_Tokens = list(set(total_Tokens))  # drop duplicates
    if 'com' in total_Tokens:
        total_Tokens.remove('com')  # 'com' is too common to be informative
    return total_Tokens
y = urls_data["label"]
url_list = urls_data["url"]
vectorizer = TfidfVectorizer(tokenizer=makeTokens)
x = vectorizer.fit_transform(url_list)
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=42)
logit = LogisticRegression()
logit.fit(x_train, y_train)
print("Accuracy:", logit.score(x_test, y_test))
x_predict = ["http://www.psn.com.pk/",
"google.com/search=faizanahmad",
"www.radsport-voggel.de/wp-admin/includes/log.exe",
"www.radsport-voggel.de/wp-admin/includes/an/log.exe",
"www.google.com",
"www.google-scholar.com/wp-good"]
x_predict = vectorizer.transform(x_predict)
New_predict = logit.predict(x_predict)
print(New_predict)
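
One possible way to adapt the script above to SQL injection is to keep the TF-IDF plus logistic regression pipeline and swap only the tokenizer and the data. The sketch below is an assumption about how that might look, not a tested detector: `make_sql_tokens` is a hypothetical name, and the regular expression is just one reasonable choice that keeps quotes, comment markers, and operators as separate tokens instead of discarding them.

```python
import re

# Words, numbers, SQL comment markers, and then any other single
# non-space character (quotes, '=', ';', parentheses, ...).
_TOKEN_RE = re.compile(r"[A-Za-z_]+|\d+|--|/\*|\*/|[^\sA-Za-z_\d]")


def make_sql_tokens(text):
    """Hypothetical stand-in for makeTokens, aimed at SQL-like input."""
    return [tok.lower() for tok in _TOKEN_RE.findall(str(text))]


print(make_sql_tokens("' OR 1=1 --"))
print(make_sql_tokens("admin' --"))
```

It would be passed to the vectorizer as TfidfVectorizer(tokenizer=make_sql_tokens), with the CSV holding input strings and labels in place of URLs. Unlike the URL version, duplicates are kept so that TF-IDF can see how often a token such as a quote appears.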