MLS-C01 Exploratory Data Analysis • Complete Question Bank
Refer to the exhibit.
```
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"s3:GetObject",
"s3:PutObject",
"s3:DeleteObject"
],
"Resource": "arn:aws:s3:::my-bucket/training/*"
},
{
"Effect": "Allow",
"Action": [
"s3:GetObject"
],
"Resource": "arn:aws:s3:::my-bucket/training/"
}
]
}
```
Refer to the exhibit.
```
# S3 Select query result on a CSV file
SELECT * FROM s3object s WHERE s."age" > 30 AND s."city" = 'New York'
# Result:
{
"Payload": [
{"Records": {"Payload": "name,age,city\nAlice,35,New York\nBob,40,New York\n"}},
{"Stats": {"Details": {"BytesScanned": 1024, "BytesProcessed": 512, "BytesReturned": 64}}}
]
}
```
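The query in the S3 Select exhibit keeps rows where `age` is greater than 30 and `city` is `'New York'`. As a rough local check of what that predicate selects, the sketch below applies the same condition with Python's standard `csv` module. The input rows are hypothetical (the exhibit shows only the result), and `select_rows` is an illustrative name, not part of any AWS API.

```python
import csv
import io

# Hypothetical input; only Alice and Bob appear in the exhibit's result.
SAMPLE_CSV = (
    "name,age,city\n"
    "Alice,35,New York\n"
    "Bob,40,New York\n"
    "Carol,28,New York\n"
    "Dave,45,Boston\n"
)


def select_rows(text):
    """Local equivalent of: WHERE age > 30 AND city = 'New York'."""
    reader = csv.DictReader(io.StringIO(text))
    # csv yields every field as a string, so age is cast before comparing.
    return [
        row for row in reader
        if int(row["age"]) > 30 and row["city"] == "New York"
    ]


matches = select_rows(SAMPLE_CSV)
print([row["name"] for row in matches])
```

This only mirrors the filter logic on a small sample; it does not call S3 Select itself.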
Match each concept to its description.
Managed compute to train a model
Host a model for real-time inference
Run inference on a batch of data
Jupyter notebook for exploration
Run data processing scripts
Match each concept to its description.
Model performs well on training data but poorly on unseen data
Model fails to capture underlying patterns in data
Error from wrong assumptions in the learning algorithm
Error from sensitivity to small fluctuations in training data
Balance between underfitting and overfitting
Refer to the exhibit.
```
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"s3:GetObject",
"s3:ListBucket"
],
"Resource": [
"arn:aws:s3:::my-bucket",
"arn:aws:s3:::my-bucket/*"
]
},
{
"Effect": "Deny",
"Action": "s3:*",
"Resource": "arn:aws:s3:::my-bucket/confidential/*",
"Condition": {
"StringNotEquals": {
"aws:sourceVpce": "vpce-12345678"
}
}
}
]
}
```
Refer to the exhibit.
```
ERROR: Could not read CSV file 's3://bucket/data.csv': Error: (103) The CSV file contains a row with 5 fields, but the header has 4 fields. Row 1502: "2023-01-15","A","B","C","D"
```
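The error in the exhibit reports a data row with more fields than the header. A minimal sketch of how such rows could be located before loading, using Python's standard `csv` module, follows. The sample data and the function name `find_ragged_rows` are made up for illustration and do not come from the exhibit.

```python
import csv
import io


def find_ragged_rows(text):
    """Return (row_number, field_count) for rows whose width differs from the header.

    Row numbers are 1-based and count the header as row 1.
    """
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    expected = len(header)
    bad = []
    for row_number, row in enumerate(reader, start=2):
        if len(row) != expected:
            bad.append((row_number, len(row)))
    return bad


# Hypothetical sample: a 4-column header followed by one 5-field row.
sample = (
    "date,col_a,col_b,col_c\n"
    '"2023-01-14","A","B","C"\n'
    '"2023-01-15","A","B","C","D"\n'
)
print(find_ragged_rows(sample))
```

Whether the offending rows should be dropped, repaired, or treated as a sign of a missing header column depends on the dataset; the sketch only reports where they are.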
Refer to the exhibit.
```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject"
      ],
      "Resource": "arn:aws:s3:::my-bucket/training/*"
    },
    {
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::my-bucket",
      "Condition": {
        "StringLike": {
          "s3:prefix": "training/*"
        }
      }
    }
  ]
}
```
Refer to the exhibit.
CloudWatch Logs snippet:
```
2023-07-01T10:00:00 ERROR: Model training failed: ValueError: Input contains NaN, infinity or a value too large for dtype('float64').
Traceback:
  File "train.py", line 45, in <module>
    model.fit(X_train, y_train)
  File "sklearn/linear_model/_logistic.py", line 1523, in fit
    ...
```
Refer to the exhibit. Consider the following AWS CLI output from an Amazon Athena query:
```
QueryExecutionId: "12345678-1234-1234-1234-123456789012"
Query: "SELECT COUNT(*) FROM my_table WHERE col1 IS NULL"
Status: "SUCCEEDED"
ResultConfiguration:
  OutputLocation: "s3://my-bucket/athena-results/"
ResultSet:
  Rows:
    - Data:
        - VarCharValue: "_col0"
    - Data:
        - VarCharValue: "5000"
```
Refer to the exhibit. Consider the following IAM policy attached to a SageMaker notebook instance:
```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::my-training-data/*",
        "arn:aws:s3:::my-training-data"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "athena:StartQueryExecution",
        "athena:GetQueryResults"
      ],
      "Resource": "*"
    }
  ]
}
```
Refer to the exhibit. Consider the following output from an Amazon SageMaker Data Wrangler data quality report:
```
Column: 'age'
Missing: 2.3%
Mean: 38.5
Median: 37.0
StdDev: 15.2
Min: 0
Max: 120
Unique: 85
```
Refer to the exhibit.
```
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": "s3:PutObject",
"Resource": "arn:aws:s3:::my-bucket/*"
}
]
}
```
Refer to the exhibit.
```
2019-09-01 12:00:01 ERROR 500 Server Error: /api/v1/users
2019-09-01 12:00:02 INFO 200 OK: /api/v1/users
2019-09-01 12:00:03 ERROR 500 Server Error: /api/v1/users
```
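When exploring log data like the exhibit above, a common first step is to summarize how often each endpoint returns a server error. The sketch below does this for the three lines shown, assuming every line follows the same `date time level status ... path` layout. The helper names `parse_line` and `error_rate_by_path` are illustrative only.

```python
from collections import Counter

LOG_LINES = [
    "2019-09-01 12:00:01 ERROR 500 Server Error: /api/v1/users",
    "2019-09-01 12:00:02 INFO 200 OK: /api/v1/users",
    "2019-09-01 12:00:03 ERROR 500 Server Error: /api/v1/users",
]


def parse_line(line):
    """Return (status_code, path), assuming the layout seen in the exhibit."""
    parts = line.split()
    status = int(parts[3])  # fourth token is the HTTP status code
    path = parts[-1]        # last token is the request path
    return status, path


def error_rate_by_path(lines):
    """Fraction of requests per path that returned a 5xx status."""
    totals = Counter()
    errors = Counter()
    for line in lines:
        status, path = parse_line(line)
        totals[path] += 1
        if status >= 500:
            errors[path] += 1
    return {path: errors[path] / totals[path] for path in totals}


print(error_rate_by_path(LOG_LINES))
```

On these three lines, two of the three requests to `/api/v1/users` are server errors.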
Refer to the exhibit.
```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject"
      ],
      "Resource": "arn:aws:s3:::data-bucket/*",
      "Condition": {
        "StringEquals": {
          "s3:x-amz-server-side-encryption": "AES256"
        }
      }
    }
  ]
}
```
Refer to the exhibit.
```
{
  "Logs": [
    {
      "timestamp": "2023-10-01T10:00:00Z",
      "level": "ERROR",
      "message": "NullPointerException: Cannot invoke method on null object"
    },
    {
      "timestamp": "2023-10-01T10:01:00Z",
      "level": "ERROR",
      "message": "NullPointerException: Cannot invoke method on null object"
    },
    {
      "timestamp": "2023-10-01T10:02:00Z",
      "level": "WARN",
      "message": "Connection timeout"
    },
    {
      "timestamp": "2023-10-01T10:03:00Z",
      "level": "ERROR",
      "message": "NullPointerException: Cannot invoke method on null object"
    }
  ]
}
```
Refer to the exhibit.
```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::data-bucket",
        "arn:aws:s3:::data-bucket/*"
      ]
    },
    {
      "Effect": "Deny",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::data-bucket/sensitive/*"
    }
  ]
}
```
Refer to the exhibit. CloudWatch Logs Insights query:
```
fields @timestamp, @message
| filter @message like /ERROR/
| stats count() by bin(1h)
| sort @timestamp desc
| limit 10
```
Refer to the exhibit.
```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::my-data-bucket/*",
        "arn:aws:s3:::my-data-bucket"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "athena:StartQueryExecution",
        "athena:GetQueryResults"
      ],
      "Resource": "*"
    }
  ]
}
```
Refer to the exhibit.
```
[ERROR] 2023-01-15T10:30:00.000Z 12345678-1234-1234-1234-123456789012 Task timed out after 300.00 seconds
[ERROR] 2023-01-15T10:35:00.000Z 12345678-1234-1234-1234-123456789012 Task timed out after 300.00 seconds
```
Refer to the exhibit.
```json
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"s3:GetObject",
"s3:ListBucket"
],
"Resource": [
"arn:aws:s3:::my-bucket",
"arn:aws:s3:::my-bucket/*"
],
"Condition": {
"StringEquals": {
"s3:ExistingObjectTag/data-type": "training"
}
}
}
]
}
```
Refer to the exhibit.
```
aws s3 ls s3://my-bucket/data/
2024-01-01 12:00:00 1024 train.csv
2024-01-01 12:00:01 2048 test.csv
2024-01-01 12:00:02 512 validation.csv
```
Refer to the exhibit.
```
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"s3:GetObject",
"s3:ListBucket"
],
"Resource": [
"arn:aws:s3:::my-bucket",
"arn:aws:s3:::my-bucket/*"
]
}
]
}
```
Refer to the exhibit.
```
$ cat /var/log/syslog | grep "OutOfMemory"
2024-01-15 10:30:45 ERROR OutOfMemoryError: Java heap space
    at org.apache.spark.sql.catalyst.expressions.GenerateMutableProjection.apply(Unknown Source)
```
Refer to the exhibit.
```
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"s3:GetObject",
"s3:ListBucket"
],
"Resource": [
"arn:aws:s3:::data-lake-bucket",
"arn:aws:s3:::data-lake-bucket/*"
]
},
{
"Effect": "Allow",
"Action": [
"athena:StartQueryExecution",
"athena:GetQueryResults"
],
"Resource": "*"
}
]
}
```
Refer to the exhibit.
```
# CloudWatch Logs Insights query
fields @timestamp, @message
| filter @message like /ERROR/
| stats count() by bin(5m)
| sort @timestamp desc
```
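The CloudWatch Logs Insights query above keeps messages matching `/ERROR/` and counts them per 5-minute bin. To make that aggregation concrete, here is a small local approximation in plain Python. The events are hypothetical, the function names are illustrative, and returning the newest bin first is an assumption about what the `sort` clause is meant to achieve.

```python
import re
from collections import Counter
from datetime import datetime, timedelta

# Hypothetical log events as (timestamp, message) pairs.
EVENTS = [
    (datetime(2024, 1, 15, 10, 1, 0), "ERROR timeout calling service"),
    (datetime(2024, 1, 15, 10, 3, 30), "INFO request ok"),
    (datetime(2024, 1, 15, 10, 4, 59), "ERROR timeout calling service"),
    (datetime(2024, 1, 15, 10, 7, 0), "ERROR disk full"),
]


def floor_to_bin(ts, minutes=5):
    """Round a timestamp down to the start of its bin.

    Assumes `minutes` divides 60 evenly, as 5 does.
    """
    return ts - timedelta(
        minutes=ts.minute % minutes,
        seconds=ts.second,
        microseconds=ts.microsecond,
    )


def count_errors_by_bin(events, minutes=5):
    """Count events whose message matches /ERROR/, grouped by time bin."""
    pattern = re.compile(r"ERROR")
    counts = Counter(
        floor_to_bin(ts, minutes)
        for ts, message in events
        if pattern.search(message)
    )
    # Newest bin first.
    return sorted(counts.items(), reverse=True)


for bin_start, n in count_errors_by_bin(EVENTS):
    print(bin_start.isoformat(), n)
```

With the sample events, the 10:00 bin holds two matching messages and the 10:05 bin holds one; the `INFO` line is filtered out before counting.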
Refer to the exhibit.
```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::data-lake-bucket",
        "arn:aws:s3:::data-lake-bucket/*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "glue:GetTable",
        "glue:GetPartitions"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "athena:StartQueryExecution",
        "athena:GetQueryResults"
      ],
      "Resource": "arn:aws:athena:us-east-1:123456789012:workgroup/primary"
    }
  ]
}
```

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::my-bucket/*",
        "arn:aws:s3:::my-bucket"
      ]
    }
  ]
}
```

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject"
      ],
      "Resource": "arn:aws:s3:::my-bucket/*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "sagemaker:CreateProcessingJob"
      ],
      "Resource": "*"
    }
  ]
}
```