Server Troubleshooting

📄 KYP server troubleshooting

This article provides an overview of the error codes that may be encountered in the KYP User Interface (UI), including their criticality, impact, associated risks, and root causes. It then describes how to approach troubleshooting and walks through several troubleshooting scenarios.

🧭 Error Code Reference Table

Error Code	Criticality	Impact	Risk Assessment	Root Cause
400	Medium	Frontend sends incorrectly formatted requests that cannot be processed by the server. Some frontend functionalities may become inoperable.	The risk depends on the functionality affected. Typically limited to specific features, making the overall risk medium.	Usually caused by mismatches between frontend and backend logic or errors in request forwarding configuration.
401	Medium	Backend recognizes requests as unauthorized, potentially making the entire application unusable as no requests can be processed.	Although occurrence is rare, the complete application inaccessibility results in medium risk when it happens.	Incorrect authorization token usage, often due to authorization logic issues on the frontend or backend side.
404	Medium	Specific frontend functionalities attempt to call backend endpoints that do not exist, rendering some features inaccessible and triggering visible error messages.	Typically affects isolated features and is usually easy to resolve, resulting in a lower risk level.	URL mapping misconfigurations or version mismatches between frontend and backend components.
500	High	The user is redirected to a generic error page ("Something went wrong") and cannot view certain frontend data.	Moderate likelihood with potentially significant impact if core functionalities are affected. Usually confined to individual backend modules.	General backend functionality error affecting isolated system parts.
502	Highly Critical	Backend is unresponsive, resulting in full application downtime.	Rare occurrence but critical impact due to potential backend infrastructure failure or service unavailability.	Severe backend failure or infrastructure issue affecting core systems.
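
To reproduce an error outside the browser, it can help to request the affected URL from a terminal and look up the returned status code in the table above. The following is a minimal sketch: the function names `kyp_criticality` and `http_status` are illustrative, and the URL in the usage comment is a placeholder rather than a documented KYP endpoint.

```shell
#!/usr/bin/env bash

# Map an HTTP status code to the criticality listed in the table above.
kyp_criticality() {
  case "$1" in
    400|401|404) echo "Medium" ;;
    500)         echo "High" ;;
    502)         echo "Highly Critical" ;;
    *)           echo "Not listed" ;;
  esac
}

# Print the HTTP status code returned by a URL.
# curl prints "000" when no HTTP response was received at all.
http_status() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$1"
}

# Usage (placeholder URL, replace with the address of your KYP instance):
# code=$(http_status "https://kyp.example.com/")
# echo "$code: $(kyp_criticality "$code")"
```

Codes that are not in the table are reported as "Not listed" instead of being assigned a guessed severity.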

🔧 Basic Troubleshooting

Below are the recommended steps to follow when encountering the above error codes:

1️⃣ Check the Healthcheck Dashboard
If any of the error codes occur, we recommend first reviewing the Healthcheck dashboard to verify:

Current server utilization is within normal parameters.
All containers are running as expected.

More information about the Healthcheck dashboard can be found in this article

2️⃣ Verify Containers with ctop
If any containers are down, you can use the ctop tool on the server side to review logs and restart the affected container if necessary.
Run the following command:

sudo ctop

A list of containers and their current states will be displayed. Select the affected container to:

Review its logs.
Restart the container using the available restart option if it is down.
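
If ctop is not available, roughly the same checks can be done with the Docker CLI. This is a sketch under the assumption that the containers are managed by Docker, as in the scenarios later in this article. The function names are illustrative, and the container name passed to them is whatever `docker ps -a` reports on your server.

```shell
#!/usr/bin/env bash

# List the names of containers that have exited.
list_exited_containers() {
  docker ps -a --filter "status=exited" --format '{{.Names}}'
}

# Show the last 100 log lines of a container, then restart it.
inspect_and_restart() {
  docker logs --tail 100 "$1"
  docker restart "$1"
}

# Usage:
# list_exited_containers
# inspect_and_restart <container-name>
```

Reviewing the logs before restarting is deliberate: the cause of the failure is usually easier to find in the output of the stopped container than after a fresh start.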

🧩 Specific Troubleshooting Examples

🔎 Silent Reboots on Windows EC2 Instances (Windows server)

Overview
Windows EC2 instances may reboot without showing a blue screen or leaving crash logs. This is usually caused by misconfiguration (e.g., page file too small, automatic restart enabled) rather than an AWS platform issue.

Steps to Troubleshoot

Enable Crash Dumps
- Disable Automatically restart in System Properties.
- Set page file on C: (system managed, ≥ RAM size).
- Select Kernel memory dump.
Check Logs & Dumps
- Look for %SystemRoot%\MEMORY.DMP after a crash.
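- Review Event Viewer for Event IDs 41 (Kernel-Power, system restarted without a clean shutdown), 1074 (shutdown or restart initiated by a user or process), 6006 (Event Log service stopped, clean shutdown), and 6008 (previous shutdown was unexpected).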
Use AWS Tools
- Check EC2 Status Checks for host or instance problems, and CloudTrail for reboots or stops initiated through the AWS API.
- Configure CloudWatch Agent to monitor CPU, memory, and disk usage.
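
The AWS checks above can also be run from a terminal. The sketch below assumes that the AWS CLI is installed and configured with credentials that may read EC2 and CloudTrail data. The function names are illustrative and the instance ID in the usage comment is a placeholder.

```shell
#!/usr/bin/env bash

# Show system and instance status checks for one instance,
# including instances that are not in the running state.
instance_status() {
  aws ec2 describe-instance-status --instance-ids "$1" --include-all-instances
}

# List recent RebootInstances API calls recorded by CloudTrail.
recent_reboot_calls() {
  aws cloudtrail lookup-events \
    --lookup-attributes AttributeKey=EventName,AttributeValue=RebootInstances \
    --max-results 20
}

# Usage (placeholder instance ID):
# instance_status i-0123456789abcdef0
# recent_reboot_calls
```

A reboot triggered from inside Windows is not an AWS API call, so an empty CloudTrail result does not rule out a reboot. In that case the Event Viewer entries remain the primary source.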

Key Takeaway
Silent reboots can usually be traced by enabling crash dumps, monitoring system metrics, and reviewing AWS logs. If unresolved, escalate with crash data to the DevOps or AWS support team.

🔹 Low Disk Space on Root Partition

Steps to Resolve:

Check Available Space and Docker Size
```
df -h
du -sh /var/lib/docker
```
Stop All Containers (e.g., via ctop)
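Stop Docker
```
systemctl stop docker.socket docker.service
```
Verify Docker is Stopped
```
# Expect "inactive". If docker.socket is still active, running `docker ps` can start the daemon again.
systemctl is-active docker
```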

Move Docker Data

mkdir /postgresdb/docker
rsync -avP /var/lib/docker /postgresdb/
mv /var/lib/docker /var/lib/docker_original
ln -s /postgresdb/docker /var/lib/docker

Start Docker
```
systemctl start docker
docker ps
```
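
Before moving the data, it is worth confirming that the target partition has enough free space for the Docker directory. The following is a minimal sketch of such a pre-flight check. The function names are illustrative, and the paths in the usage comment are the ones used in the steps above.

```shell
#!/usr/bin/env bash

# Size of a directory tree in KiB.
dir_size_kb() {
  du -sk "$1" | awk '{print $1}'
}

# Free space in KiB on the filesystem that holds the given path.
free_kb() {
  df -Pk "$1" | awk 'NR==2 {print $4}'
}

# Succeed only if the source tree is smaller than the free space at the target.
can_fit() {
  [ "$(dir_size_kb "$1")" -lt "$(free_kb "$2")" ]
}

# Usage:
# can_fit /var/lib/docker /postgresdb && echo "enough space" || echo "not enough space"
```

The comparison is strict and leaves no safety margin, so treat a result that passes only narrowly as a reason to free up more space first.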

❌ Docker Commands Not Working (Freeze)

Check Filesystem Type
```
df -Th
```
Supported by the default overlay2 storage driver: ext4, xfs (xfs must be formatted with ftype=1)
If Unsupported, Install fuse-overlayfs
```
apt install fuse-overlayfs
```
Update Docker Configuration
Edit /etc/docker/daemon.json:
```
{
  "storage-driver": "fuse-overlayfs"
}
```
Restart Docker
```
systemctl restart docker
```
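
The filesystem check at the start of this scenario can be scripted so that the decision does not depend on reading the `df` output by eye. This is a sketch that treats only the types named above as supported. The function names are illustrative, and `/var/lib/docker` is assumed to be the Docker data directory, as in the previous scenario.

```shell
#!/usr/bin/env bash

# Print the filesystem type that backs a path.
fs_type() {
  df -PT "$1" | awk 'NR==2 {print $2}'
}

# Succeed only for the filesystem types listed above as supported.
fs_supported() {
  case "$1" in
    ext4|xfs) return 0 ;;
    *)        return 1 ;;
  esac
}

# Usage:
# fs_supported "$(fs_type /var/lib/docker)" || echo "unsupported: consider fuse-overlayfs"
# docker info --format '{{.Driver}}'   # after the restart, prints the active storage driver
```

Running `docker info` after the restart is a quick way to confirm that the daemon picked up the `storage-driver` setting from daemon.json.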

🛠 Need Help?

If you have any questions or require support at any stage of the investigation, please contact our Support Team via the KYP.ai Support Portal.