Posts: 4
Threads: 1
Joined: Sep 2023
Sep-02-2023, 10:07 PM
(This post was last modified: Sep-02-2023, 10:07 PM by dududada.)
The current code is already available at https://github.com/xunzheng/notears
linear.py lets me run a single simulation: https://github.com/xunzheng/notears/blob.../linear.py
Currently, we fix n, d, s0, graph_type, sem_type = 100, 20, 20, 'ER', 'gauss'.
This gives me one SHD value from a single run of the simulation. (Don't confuse this with the n above, which is the number of observations in one simulated data set.) What I want now is to run 100 simulations, i.e., repeat linear.py 100 times, and then take the average SHD as my final number.
I also want to measure the run time of the whole simulation.
But I don't know how to achieve this.
Posts: 6,778
Threads: 20
Joined: Feb 2020
Take everything that is under "if __name__ == '__main__':" and put it in a function called main. Now you can run it as many times as you want.
You might want to modify the code to append results to your save files, or return the values instead of writing them to a file. That depends on how you need to process the results and how they are formatted.
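A minimal sketch of that refactoring, assuming the body of the script can be wrapped in a function that returns its result. `run_once` is a hypothetical stand-in that returns a random number in place of the real NOTEARS computation; the loop and the `time.perf_counter()` timing are the parts meant to carry over.

```python
import random
import time


def run_once(rng):
    """Hypothetical stand-in for one simulation; returns a fake SHD-like score."""
    return rng.randint(0, 5)


def main(num_runs=100, seed=0):
    """Run the experiment num_runs times; return (mean score, elapsed seconds)."""
    rng = random.Random(seed)
    start = time.perf_counter()
    scores = [run_once(rng) for _ in range(num_runs)]
    elapsed = time.perf_counter() - start
    return sum(scores) / len(scores), elapsed


if __name__ == '__main__':
    mean_score, elapsed = main()
    print(f"mean score over 100 runs: {mean_score:.2f}")
    print(f"total run time: {elapsed:.3f} s")
```

Returning the value from each run, rather than printing it or writing it to a file, is what makes the averaging step a one-liner afterwards.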
Posts: 4
Threads: 1
Joined: Sep 2023
Sep-03-2023, 12:35 AM
(This post was last modified: Sep-03-2023, 02:12 AM by deanhystad.)
Thank you for your hint. That's helpful. Now I can update the code:
`
def run_experiment():
    from notears import utils
    # utils.set_random_seed(1)  # this line cannot be used to ensure different outcomes in each round of the loop
    n, d, s0, graph_type, sem_type = 1000, 20, 20, 'ER', 'gauss'
    B_true = utils.simulate_dag(d, s0, graph_type)
    W_true = utils.simulate_parameter(B_true)
    np.savetxt('W_true.csv', W_true, delimiter=',')
    X = utils.simulate_linear_sem(W_true, n, sem_type)
    np.savetxt('X.csv', X, delimiter=',')
    W_est = notears_linear(X, lambda1=0.1, loss_type='l2')
    assert utils.is_dag(W_est)
    np.savetxt('W_est.csv', W_est, delimiter=',')
    acc = utils.count_accuracy(B_true, W_est != 0)
    print(acc)


if __name__ == '__main__':
    num_experiments = 3
    for _ in range(num_experiments):
        run_experiment()
`
The code above prints the result of each of the three rounds of the loop. But I want the average SHD over the whole loop, rather than the outcome of each individual round.
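On the commented-out seed line in the code above: setting the same seed inside the function would make every round identical. One common approach, sketched here with the standard `random` module rather than the notears helpers, is to derive a different seed for each round from a base seed, so the rounds differ but the whole experiment stays reproducible. `simulate` is a hypothetical placeholder for the data generation step.

```python
import random


def simulate(seed):
    """Hypothetical placeholder: draw three numbers from a generator seeded with `seed`."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(3)]


base_seed = 1
num_experiments = 3

# One distinct seed per round: base_seed, base_seed + 1, ...
results = [simulate(base_seed + i) for i in range(num_experiments)]

# Rounds differ from each other, but the same seed reproduces the same round.
print(results[0] != results[1])
print(simulate(base_seed) == results[0])
```

With notears, the analogous step would presumably be to call `utils.set_random_seed(base_seed + i)` at the start of round `i`; that is an assumption about how the helper behaves, not something verified here.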
deanhystad wrote Sep-03-2023, 02:12 AM: Please post all code, output and errors (in their entirety) between their respective tags. Refer to the BBCode help topic on how to post. Use the "Preview Post" button to make sure the code is presented as you expect before hitting the "Post Reply/Thread" button.
Posts: 4
Threads: 1
Joined: Sep 2023
(Sep-02-2023, 10:38 PM)deanhystad Wrote: Take everything that is under "if __name__ == '__main__':" and put it in a function called main. Now you can run it as many times as you want.
You might want to modify the code to append results to your save files, or return the values instead of writing them to a file. That depends on how you need to process the results and how they are formatted.
Thank you for your hint. I have a follow-up question.
Posts: 6,778
Threads: 20
Joined: Feb 2020
What do you want to average, acc?
def run_experiment():
    from notears import utils
    # utils.set_random_seed(1)  # this line cannot be used to ensure different outcomes in each round of the loop
    n, d, s0, graph_type, sem_type = 1000, 20, 20, 'ER', 'gauss'
    B_true = utils.simulate_dag(d, s0, graph_type)
    W_true = utils.simulate_parameter(B_true)
    np.savetxt('W_true.csv', W_true, delimiter=',')
    X = utils.simulate_linear_sem(W_true, n, sem_type)
    np.savetxt('X.csv', X, delimiter=',')
    W_est = notears_linear(X, lambda1=0.1, loss_type='l2')
    assert utils.is_dag(W_est)
    np.savetxt('W_est.csv', W_est, delimiter=',')
    return utils.count_accuracy(B_true, W_est != 0)


if __name__ == '__main__':
    num_experiments = 3
    acc = [run_experiment() for _ in range(num_experiments)]
    # This averaging assumes run_experiment() returns a single number.
    print("Average acc =", sum(acc) / len(acc))

You should not use assert.
Posts: 4
Threads: 1
Joined: Sep 2023
(Sep-03-2023, 02:17 AM)deanhystad Wrote: What do you want to average, acc?
Yes, I want to get the average value of acc. The current acc gives me a list like this: {'fdr': 0.0, 'tpr': 0.95, 'fpr': 0.0, 'shd': 1, 'nnz': 19}
Posts: 6,778
Threads: 20
Joined: Feb 2020
Sep-03-2023, 01:43 PM
(This post was last modified: Sep-03-2023, 03:26 PM by deanhystad.)
That is a dictionary, not a list. Do you know what those fields mean? What is the math to compute the average?
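One possible answer to that question, as a sketch: since each run returns a dictionary of metrics, the average can be taken key by key. The three sample dictionaries below are made-up values in the same shape as the output quoted above, not real results, and `average_metrics` is a hypothetical helper name.

```python
def average_metrics(runs):
    """Average a list of metric dictionaries key by key."""
    keys = runs[0].keys()
    return {k: sum(run[k] for run in runs) / len(runs) for k in keys}


# Made-up sample values shaped like the count_accuracy output, for illustration only.
runs = [
    {'fdr': 0.0, 'tpr': 0.95, 'fpr': 0.0, 'shd': 1, 'nnz': 19},
    {'fdr': 0.1, 'tpr': 0.90, 'fpr': 0.0, 'shd': 3, 'nnz': 20},
    {'fdr': 0.0, 'tpr': 1.00, 'fpr': 0.0, 'shd': 2, 'nnz': 20},
]

avg = average_metrics(runs)
print("Average SHD =", avg['shd'])  # prints: Average SHD = 2.0
```

Whether a plain mean is the right summary for the rate fields (fdr, tpr, fpr) is a separate judgment; for shd, the arithmetic mean is what the original question asks for.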